How WooCommerce Stores Your Data

plastic bins
A real world data store, as seen at Joann’s on 8/7/18

This post is the kickoff for a linked set of posts regarding the underlying WP/WOO schema.

The target audience for this are Data Simpletons, such as myself, who look at the underlying WP/WOO and basically say to themselves:  WTF?

These posts are not meant for database gurus, or PHP wizards, or WP experts — all of whom are no doubt far more knowledgeable than I, a simple-minded end-user of the WOO product, who slaved for three years like an oarsman in a Roman galley ship in support of his wife’s e-commerce store.

My remarks are mainly designed to appeal to the person who knows WP and the WOO environment, is somewhat familiar with database concepts, has a facility with spreadsheets in the MS Windows environment, and enjoys data wrangling.

As an aside, I will say that in my former life as a data modeler, I soon grasped that if I expected to be paid for completing some project, I had to understand just enough of a problem domain in order to be able to specify something useful for a customer, as opposed to understanding how everything works — which is the path to madness, frustration and penury.

The WP framework is extremely elaborate and complex, and my goal here is not to try to explain it.

Rather, I want to describe what my business goals might be, in a generalized form of course, then take a step back and assess the WP/WOO env, followed by a conceptualization of how I plan to achieve my business goals.

Above all, I am going to try to keep this as simple as possible, but never too simple as to be essentially useless.  Inevitably things will get more complex as our understanding of how to attack the problem domain evolves and becomes more sophisticated.

Let’s start with some rudimentary ideas, and the business issue at hand.

If you are an end-user of the WP/WOO code base, you will soon encounter a vast panoply of confusing and seemingly interdependent software objects that will be presented to you — if you peek under the covers — in the incomprehensible and largely intimidating vestments of Woocommerce’s REST API.

The good news is that you can ignore REST completely, as it is largely irrelevant to your concerns as a harried store owner, or if you are just a data cowboy or cowgirl out to have some fun.

WP is a complex eco system, and you will first have to grasp the basics by wading in and signing up for a WP account, then perhaps downloading on your local machine a copy of WP and installing the WOO plugin, which comes with a baseline “theme” (a theme is nothing more than the visual interface to WP and your data) — unless your WP hosting does all that for you on a shared or dedicated server.

(I will not get into hosting issues here, but at a minimum try to ensure that the hosting provider will do whatever is necessary to make your store pages display almost instantly on a user’s browser; that certain housekeeping details specific to WP which are known to slow down sites are taken care of by their engineers, and not you; and that your site is configured to be as secure as possible, and not overrun by the invisible nasty bots that will consume your allocated resources, and degrade your customer’s experience.)

You will no doubt examine this theme, and perhaps take a look at some others, paying and free. (See my previous posts from 3 years ago, which I dearchived recently, in order to give you a sense of the experience of first dealing with the WP environment as a “newbie,” the latter a term I rarely use, as it’s condescending.)

If you wish to change this more to your liking, you will have to learn CSS, should you desire to further customize a theme beyond the basic knobs that most themes provide.

CSS can be tricky at first, but usage of the INSPECT -> COPY SELECTOR in Chrome or other browsers will eliminate the frustration of incorrectly targeting specific elements in your UI, and allow you to style these accordingly.

However, CSS will not influence the behavior of the underlying body of code itself.  In the separation of concerns that is at the heart of today’s Web, CSS is for how your online store looks, HTML is for how your pages are structured, after which you will come upon the submerged 9/10th of the  giant iceberg that is not visible to you as an end user, but which in reality is what makes the whole thing run.

Soon, you will discover the need to download and use numerous “plug-ins” into your WP administrator panel  — which are bits of code that extend the functionality of WP/WOO out of the box.

This at first will seem like a wonderful and apparently cheap way of getting the job done — after all, many of these plug-ins are free.

Unfortunately, after some time, you will soon learn the WP tends to change versions from time to time, which can break your e-commerce setup unexpectedly.

You will also notice that WOO itself changes frequently, which often can impact your theme as well as your plug in.  Moreover, too many plugins, and your site will become lethargic, or stop working correctly due to plugin conflicts. In addition, some plugins tend to be chatty, and constantly harass you for donations on your administration panel. They can also silently create UI objects on the various screens you use, and populate your production DB with additional data structures over which you have little control.

As a result of this rather unstable and potentially risky situation, you will begin to notice that your ecommerce business has become totally reliant on an elaborate technology stack that seemingly can only be customized by an experienced programmer.

If you decide to attempt this task yourself (very bad idea, unless you read this blog), you will be asked to produce what are called “child themes” and create bits of PHP code (or “snippets”) on the server host, which you will have to access via something like FileZilla (after long and incomprehensible discussions with your hosting service regarding IP whitelisting, following by various battles with your firewall implementation).

Much of this at first will seem beyond the ken of the average store owner, who might then opt to pay a developer to customize said code (EXTREMELY EXPENSIVE, unless you visit Twitter, which appears to be awash with ads touting lower-cost approaches) .

Notice that you will now depend on this code developer, who may suddenly decide to decamp for Lisbon next week in order to become a hip digital nomad (leaving you with a pile of undocumented code),  as well the various faceless plug in developers (who will often stop maintaining their plug-ins after some duration), not to mention various WOO and WP jolly developers, typically represented by a thought leader with blue hair (or his various lieutenants, known as “happiness engineers”) who will appear from time to time at various conferences or WP camps and pump CMS sunshine over the land.

We shall refer to all this here as M — as in the malebolge, Dante’s 8th circle of hell; please note that I am only using this term in jest.

M is a stack of code that relies essentially on a huge body of PHP (a weak-typed — by which in this context I mean largely unopinionated — popular computer language, that is heavily used by FB, WP and their ilk)  that sits on top of a “relational” database, which contains all the valuable data for your site, without which your online business would go down the toilet in an instant.

It is PHP that makes calls to this db in order to select, display and sometimes modify your shop’s underlying data.  This is the same data that you have laboriously created over a period of months or years, that is now trapped in M, aptly named because it is initially challenging to attempt to do anything with this data, once you step outside of the closed WP / WOO environment — despite all protestations as to its open sourceness notwithstanding.

Let’s say, as an example, that you wanted to look at your store products in some novel but useful and interesting way — beyond, say, the rudimentary and laborious interface provided by M. I am talking here about looking at your data as a store owner, that is to say, someone who is keen in gaining a better understanding of his or her business, possibly on a dynamic basis, with the capability to slice and dice various scenarios that you will use to model how your business is performing, and how to optimize, say, your price/inventory structures on a managed, point-and click basis.

And let us say, too, that in order to have this interesting and useful functionality, you wish to access your data directly, and manipulate it in standard ways that are available using, say, the widely-understood relational model.

Unfortunately, you will soon realize that this is not possible.

The reasons for this are quite complex, but I will simplify things here by playing Captain Obvious for a minute and declaring that M is relational but it is also other things  in fact, once you examine the underlying data and code structures, which we shall do here and in future posts, you will soon realize that there are interesting non relational facts about M.

In order to allow for a universe of apps that the designer of WP’s core DB had no idea would be, flexibility was introduced in to the schema. such as to allow a WP developer to extend the underlying schema without breaking it. This is all well and good, but makes it rather difficult for anyone who is not an experienced data modeler to understand.

Moreover, as a store owner, you shouldn’t have to deal with M’s implementation details.

You might have additional concerns.

For example,  you might begin to worry that your data — the lifeblood of your business — is now somehow trapped in M, and cannot get out, without your going through a long term exercise of learning PHP and AJAX and JSON and many other underlying knicks and twaddles that no end-user or typical store owner will have the time, patience, or aptitude to learn.

Relax.

There is another approach, as I will soon demonstrate.

My intention will be to show how you can actually download this data and perform, say,  EOQ inventory analysis, with tools such as Microsoft’s Power BI Desktop.

So the good news is that your data is not really trapped in a baroque silo that I am jokingly here referring to as M, data that you might think you cannot easily get at or modify directly, without ruining your estore.

Some things to keep in mind.

Much like its WP core underlying scaffolding, WOO is quasi relational.

Much of the RI of the WOO relational schema is enforced through PHP code, although a major/minor foreign key relationship is specified between two significant tables that will be of interest to us here.

Now keep in mind that the data structures that support your business are inherently tabular.

Despite the fact that many UI designers usually turn up their noses at the very mention of “pedestrian” HTML tables, the reality is that often the most efficient baseline representation of your retail business data is rows and columns (and associated charts and graphs).

In fact, the underlying semantics of retail db domains are often taught in introductory or advanced books (see Date, et al.)  on relational theory:

These consist of the usual

PRODUCT
CUSTOMER
ORDER
INVENTORY …

which are logical entities, or relations in relational database theory (hence its name), that are inter-related (ideally in Boyce-Code 3NF), and whose structures when implemented result in an abstraction that sits atop some computer file system.

It is this abstraction — in effect, a model or way of representing some well-understood world, such as your business — that enables an end-user to express logically consistent queries regarding the problem domain (such as, how many widgets did I sell last week?) via SQL, or perhaps perform more intricate analysis through spreadsheet macros, once the data has been downloaded to a DSS environment.

But you will initially search long and in vain, under the hood in M, for a PRODUCT table that you can work with in WOO.

Alas, it does not exist.

Instead, what you will encounter is an abstraction that is captured largely, but not exclusively, in the crucial wp-posts and wp-postmeta tables. IMHO the latter is a something of a misnomer, as what is meant in M by meta is in fact what are usually referred to as attributes, or table columns, if you will. In M, these are modeled as ancillary arrays in the core WP schema. (If you don’t get this, don’t worry: we will soon delve into this subject in exhaustive detail, including how to write SQL code that retrieves product level information.)

Moreover, you will soon discover that your data (such as information about, say, your products) can be stored in these data buckets in what is called serialized form, further entrapping your data in an incomprehensible and not easily migrated format, which, by the way, cannot by read by SQL.

So what we must additionally do, if we are interested in liberating the information in these data buckets, is find a way to unpack this encoded information and replace it with data representations that are independent of any given computer language.

This seems like a good place to stop before going over the dreaded TLDR; precipice, which in fact I may have crossed some time ago.

In the next post, we will indulge ourselves in few additional throat-clearing conceptual remarks, before taking a much closer and extremely detailed look at how M actually represents Products.