Gabriel Ha, MathWorks
The Database Explorer App bridges the processes of accessing your relational/SQL data and analyzing it in MATLAB®. Its visual interface allows you to quickly pull data into a MATLAB friendly table format without needing to know the Microsoft® SQL programming language. You can then easily perform your subsequent analytics utilizing other MATLAB tools and functions, and auto-generate code to retrieve database results so others can reproduce your work.
See more information on Database Explorer App.
Database Toolbox enables you to apply your MATLAB knowledge to data stored in databases, even if you’re not familiar with the database language. The Database Explorer App embodies this value of Database Toolbox, and this video will give you a quick yet thorough overview of the app, which allows you to interact with relational data in a visual manner. Even if you’re not familiar with SQL, you’ll be able to bring that data into MATLAB for further analysis.
First things first: Where is it? Just go to the Apps tab, and if you're confronted with a flurry of different apps, just search for it with the word “database.”
This demo features a MySQL database, and we’ll first need to set up a connection. Select this from the Connect menu, and then fill in the appropriate details. In particular, you’ll need to point MATLAB to the location of the database driver, which is usually a jar file for JDBC connections. If you don't know these parameters offhand, simply get in touch with your IT contact to provide the details.
We’ll test the connection providing our username and super-secret password. And since the test was successful, we are good to go.
Let’s officially connect to our database, which we now find in our Connect menu. We’ll supply the credentials, and in our case, the catalog is the toy_store. And with that, we are all set to explore!
The panel to the left lists out all the tables found in our database, and if we select one, we’ll see a preview of the data, along with the SQL query that would generate that result; pretty neat learning tool if you want to learn some SQL. We can expand the table and see the list of columns for a more complete view of our database, and we can select and deselect columns based on what we want to see. We can also change the number of items displayed in the Data Preview.
Let’s look at some basic functionality at our disposal. The results from the inventory table are currently sorted by product number, but we might want to sort it by something else, like price. Simply click on “order by,” select the Price column, add this, and now we can easily view our results using ascending or descending price.
So I’m looking at this data, and I'm thinking to myself, “Man, it’d really be nice to be able to know what these products actually are, because product number is kind of useless from a description standpoint.” If I switch over to the product table, I note that there is a productDescription column, which seems to be what I want. But in this current view I can't select the information with the inventory table.
To do this, we need to join the tables, which is a very common SQL operation. We simply select our first table and then click on Join. We’ll select the table to be joined, and then select the column that exists in both tables; otherwise our results won’t make much sense. By default, this will perform an inner join, which is usually what you want as it gives the intersection of the tables. You’ll also notice the diagram in the lower right-hand corner, which gives you a visual aid of what’s happening. And now that we’ve done this, we can view and select the contents of both tables in whatever manner helps augment our understanding of the data.
So at this point, let's say my boss comes in and says, “I need you to tell me our total sales for February.” There is a February column in the salesVolume table, so that’s likely what I’m looking for. Since I also have price information, I can just multiply them together and sum each result for my final answer.
But remember that I can't get that complete view of everything because we haven’t yet joined the tables; we’ll need to do an additional join on our already joined results. We notice that the sales volume table doesn't have a product number column, but it does have a stock number column, which, if you’ll note, is also contained in the product table.
So let’s perform another join on this shared column. Note the diagram has updated, showing us what’s happening, and now when we go back, we have full access to the sales volume table in context of the other two. We’ll go ahead and add some columns to our query, and now we have the information all in one place to get total sales for February, or for any month.
There’s one hitch: We have some missing data, indicated by this NaN. And regardless of our policy to deal with missing data, it’s good to ensure that missing data is treated in a consistent manner.
That’s where Import Options come into play. As you can see, data is by default imported as NaN for anything of type double, but we can change specific columns to import 0 instead. As another example, note that inventoryDate is currently being imported as a char, but it would be more meaningful to specify it as a dateTime, which comes with its own menu where we can format dateTimes exactly as we’d like.
Now that our data is cleaned up, we’re ready to perform the analysis for my boss. If you’re a SQL expert, you can actually perform all the calculations entirely within SQL, but it’ll probably be easier if we just leverage our existing MATLAB knowledge. Database Explorer lets us import the data into MATLAB as a table. We’ll first clean up our selected columns, export the data, and take a look.
All we have to do is take the dot product of the price column and the February column, which gives us total sales for each item, and the sum of all of them gives us our answer.
Now, I could go ahead and email the results to my boss, or I could do him one better and give him total sales for every month, or even the entire year, because it really wasn’t that difficult.
One trick you can do in MATLAB is extract multiple columns from a table into a matrix, which lets you perform matrix operations. In this case, we need column 1: price, and columns 5 through 16 for the corresponding sales volume. We’ll extract the price data as a vector, and then all the sales data as a matrix. The first colon in each line of code indicates that I want all the rows, and the next parameter indicates my columns of interest. Note that I’m using brace notation to get the data as an array, or matrix, and not a table.
Now we can simply use the dot star operator to apply a dot product from the price array onto each month. Taking the sum of that matrix gives us monthly sales. Take the sum of that matrix, and we’ve got sales for the entire year, easy-peasy, and hopefully I can email this to my boss and make him happy.
So given that this is a Database Explorer App demo, why bother going through all this MATLAB code? Well, aside from maybe showing you some new coding tricks, I want to emphasize that there’s quite a bit of analysis that you might apply to your data. You’ve got this great setup, but you might have forgotten how you retrieved all that data. It’s important for your coworkers, as well as your future self, to be able to replicate your results.
Database Explorer can generate a MATLAB script with one click, and that script will do everything we just did to get data out of our database.
One very important thing to note is that the app does not save the password that you used to connect to your database, for obvious security reasons. So anyone using this code will need to supply the proper credentials into this line of code, and if you watching this video are any sort of decent human being, you'll look away from your computer screen right now so that you won’t see that my super-secret password is “matlab.” Scroll down and… okay, you can look again.
To make this script complete, let’s add in our MATLAB work. We’ll create a new section for clarity, paste in our command-line work, and now any prospective coworker and our future selves can easily replicate our results.
So what we’ve demonstrated is the essential Database Explorer workflow: Visually interact with your data, bring it into MATLAB, perform your analysis, and make it reproducible.
Here are a few more Database Explorer features you might find useful:
You can generate the SQL query used to retrieve your results.
You can filter database items using the WHERE statement. For example, let’s say that I just want to see all the inexpensive items, like anything that’s priced below 10 bucks. Click on Where, we’ll say that we want price… less than… whatever you want, actually. You can even compare it to its corresponding value in another column if you need to be clever, but we’ll go with 10, and now we see those items.
And these options are by no means mutually exclusive. You can join tables ordered by one column, where another column is greater than 20, no duplicates… you can do that in SQL.
Last, but not least, if you’re familiar with SQL, you can directly edit the SQL query. This gives you the full benefits of language which may not be directly enabled by our buttons, while still being able auto-generate SQL through interacting with the app.
When we click on “Manual,” it retains any SQL code that we’ve been indirectly generating, and now you can type in whatever you want. Note that this opened a new tab in Database Explorer, which serves as a distinct working environment. You can actually have multiple sessions into the same database or other databases; just use the “New Query” button, and you can still retain access to your previous work.
Use Database Explorer to easily access and analyze your relational data, even if you don’t know SQL, leveraging it if you do. Thanks for watching, click on the links in the descriptions for even more info, and don't hesitate to leave us feedback.