What should go in a next-generation MATLAB X?
Andrew Janke
on 11 Sep 2021
Let's say MathWorks decides to create a MATLAB X release, which takes a big one-time breaking change that abandons back-compatibility and creates a more modern MATLAB language, ditching the unfortunate stuff that's around for historical reasons. What would you like to see in it?
I'm thinking stuff like syntax and semantics tweaks, changes to function behavior and interfaces in the standard library and Toolboxes, and so on.
(The "X" is for major version 10, like in "OS X". Matlab is still on version 9.x even though we use "R20xxa" release names now.)
What should you post where?
Next Gen threads (#1): features that would break compatibility with previous versions, but would be nice to have
@anyone posting a new thread when the last one gets too large (about 50 answers seems a reasonable limit per thread): please update this list in all earlier threads. (If you don't have editing privileges, just post a comment asking someone to do the edit.)
371 Comments
A native xarray -- in which an array's dimensions (row/col/page/...) can be named and, even more conveniently, assigned coordinate variables. Summarizing functions, arithmetic, interpolation, etc. can then act based on coordinate names and values. I increasingly find myself dealing with high-dimensional data. I made my own xarray implementation, but I am sure MathWorks could make one that is much more performant and convenient...
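To make the idea concrete, here is a purely hypothetical sketch, loosely modeled on Python's xarray; the xarray constructor, the coordinate name-value arguments, and the named-dimension reduction below are all invented for illustration and do not exist in MATLAB today:
% Hypothetical API -- none of this exists in MATLAB:
temp = xarray(rand(12, 5), ["month", "station"], ...
    month = 1:12, station = ["S1" "S2" "S3" "S4" "S5"]);
m = mean(temp, "month");   % reduce along a named dimension, not a dimension number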
In a professional setting I've found MATLAB to be the most effective tool for engineering visualization problems. We use it to rapidly build tools that integrate our telemetry with MATLAB's flexible plotting capabilities, and to create GUIs that let us peel back the onion of some complex datasets. Some examples are overlaying sensor telemetry on recorded images, or analyzing Kalman filter performance at scale. These tools are highly interactive by design (e.g. clicking on objects to interact, custom click-and-drag, callbacks, keybinds, etc.).
The ability to create visualization tools like this, paired with the extensive math and specialized toolbox capabilities of MATLAB, is a technology differentiator that I don't think MathWorks leans into enough. I say this because renderer performance is often the bottleneck for our tools, and performance does not seem to be a primary focus of the MathWorks development team (I do see and appreciate performance updates in the release notes, but I wouldn't call it the North Star). On our team we have to use a lot of low-level tricks to make things feel reasonably performant (e.g. hgtransforms, NaN breaks to minimize the number of objects plotted, minimizing cla() calls, etc.), and even then it's not what I would consider good. Some examples of issues that come up fairly often:
- renderer performance gets significantly worse as a function of the figure/axes size on the monitor
- text() objects scale terribly and cause the axes to become very slow
- modern axes objects use "linger" mechanics that bog down performance (https://undocumentedmatlab.com/articles/improving-graphics-interactivity)
- patch and surface objects can become quite slow when interacting with them, particularly with a maximized figure
- uifigure performance is so bad (and worse on Linux vs. Windows) that we do not use it for anything except for the occasional geoglobe() plot
In my pie-in-the-sky MATLAB dream world, all figures would perform close to the Unreal Engine 5 renderer: I could draw basically unlimited shapes, surfaces, geometry, lighting, etc., and it would all be dispatched to some GPU-based renderer that always feels snappy and interactive. I know that's not realistic, but it's the general direction I'd like to see MathWorks steering the product.
My point is: please invest heavily in your visualization infrastructure! It is one of MATLAB's key technology differentiators as Python gobbles up market share left and right. I get frustrated because updates to MATLAB often come at the cost of performance; I would happily leave all the gloss and polish on the table if it meant I could visualize more complex datasets or run my code faster. For instance, the updates I have been most happy with in the past few years have been the "page" functions, such as pagemtimes. These are functions I use all the time to process data faster and/or at larger scale.
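For readers who haven't tried them: the page functions batch a matrix operation over the third dimension without an explicit loop. A minimal pagemtimes example (the function has been available since R2020b):
A = rand(3, 3, 1000);   % a stack of 1000 3x3 matrices
B = rand(3, 3, 1000);
C = pagemtimes(A, B);   % C(:,:,k) equals A(:,:,k)*B(:,:,k) for every page k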
The first change I would make would be to scrap the special treatment of Nx1 and 1xN matrices. These are given special status (as "column vectors" and "row vectors"), which must, I suppose, be helpful sometimes, but in practice it's confusing (that is, it confuses me) and makes general code much more complex than it should be.
For example, if you write c = a(b) where the values of all the variables are numeric arrays, the rule is that c will be the same shape as b, except when one of a and b is a column vector and the other is a row vector (in that case c takes a's orientation). An exception to a general rule is, as a general rule, a bad thing. One that affects as fundamental an operation as indexing an array is a very bad thing.
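A minimal demonstration of the exception:
a = [10 20 30];   % row vector
b = [1; 2];       % column-vector index
size(a(b))        % 1 2 -- the result takes a's orientation, not b's shape
A = magic(3);     % a non-vector array
size(A(b))        % 2 1 -- here the result keeps b's shape, per the general rule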
Another exception: size truncates trailing 1s except in the case of column vectors, and ndims returns 2 for column vectors. General code therefore has to handle this case specially. For an example of code that could be simpler without these complexities see exindex.
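For example:
size(ones(2, 3, 1))   % 2 3 -- the trailing singleton dimension is dropped
v = [1; 2; 3];
size(v)               % 3 1 -- but a column vector keeps its trailing 1
ndims(v)              % 2, so conceptually one-dimensional data reports two dims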
It makes for messy code in other ways: my arguments blocks are peppered with (1,1) to indicate scalars, when (1) would be easier to read and should be sufficient.
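For reference, here's the kind of arguments block I mean (scaleBy is just an illustrative name; the syntax is standard):
function y = scaleBy(x, k)
arguments
    x (:,1) double
    k (1,1) double   % a scalar: (1,1) is required where (1) would read better
end
y = k * x;
end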
It's not as if row vectors and column vectors are always treated the same as each other. Matrix multiplication distinguishes between them, of course, as does the loop construct for. Making them a special category, when they're actually just different shapes of arrays, simply adds complexity.
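The for construct makes the asymmetry easy to see:
for v = [1 2 3]     % iterates three times; v is a scalar on each pass
end
for v = [1; 2; 3]   % iterates once; v is the entire column vector
end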
Can anyone make a case for keeping this peculiarity?
Some simple things would be nice:
- counter += 1; salary *= 2 % operator assignment (compound assignment operators)
- y = (x < 0) ? 3 : 2*x; % ternary operator
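For comparison, the closest you can get today; the tern helper below is a common community workaround I'm sketching here, not a built-in, and unlike a real ternary operator it evaluates both branches:
counter = counter + 1;                      % today's spelling of counter += 1
tern = @(cond, a, b) cond.*a + ~cond.*b;    % elementwise ternary stand-in
y = tern(x < 0, 3, 2*x);                    % stand-in for y = (x < 0) ? 3 : 2*x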
Add a 'parfor' option to splitapply() and grouptransform(), or create separate parallel versions of those two functions.
Right now the group-based functions run through the groups with a for-loop, which is very slow for data with a large number of groups. When the same data set was run through a parfor loop instead, it was 5 to 10 times faster.
Functional programming that hides the looping details brings coding closer to human cognition, and parfor is a really powerful beast. Combining these two workhorses would let MATLAB take a decisive lead over the sluggish reptile.
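As a stopgap, something like the following sketch works today; parsplitapply is a name I made up (not a MathWorks function), and it assumes the grouping vector comes from findgroups and that fcn returns a scalar per group:
function out = parsplitapply(fcn, x, G)
% Parallel stand-in for out = splitapply(fcn, x, G) with scalar-valued fcn.
nGroups = max(G);
out = zeros(nGroups, 1);
parfor k = 1:nGroups
    out(k) = fcn(x(G == k));   % each group is reduced on a worker
end
end
% Usage:
% G = findgroups(T.Category);
% groupMeans = parsplitapply(@mean, T.Value, G);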
My wish list:
- A real, beautiful dark theme
- Improving the appearance of figures. Reduce padding around subplots, set default axis and tick mark color to black, adjust default linewidth and font sizes to be a bit larger. In general, try to make figures made quickly with default settings look better.
- Multi-start options for all solvers in the optimization/curve fit toolbox.
- Consistent arguments for plotting functions. I think some still use different capitalization schemes (like "LineWidth" vs "linewidth").
The current thread is fairly close to my arbitrary suggested limit of 50 answers. If you think it makes more sense to start a new thread, go ahead. I'm happy to start a new one, but you can also do it and add it to the list of threads (don't forget to edit the other threads as well).
In an attempt to discourage new answers (while waiting for the ability to soft-lock threads), I have started editing the older questions by putting '[DISCONTINUED]' at the start of the question.
This wonderful thread is becoming unwieldy and slow to respond to editing on both laptop and desktop for me. If others are having the same problem, perhaps this Question should be locked, at least for new Answers (is that possible?), and a new Question opened for new Answers?
My wish list is not about code improvements, but about official tutorials:
- a tutorial on using splitapply to take advantage of parallel computation.
- a tutorial on assignment and indexing involving comma-separated lists and cell arrays. It should not only show what works, but also explain which syntax would go wrong, and why it goes wrong.
For example, x = ["a", "b"] is a 1x2 string array. But then x(:) becomes a column vector, x{:} is a comma-separated list, and [x{:}] is the character vector 'ab'. Such 'delicate' usage is the biggest bottleneck in my coding process. @Stephen23 has written a tutorial on comma-separated lists; I hope MathWorks staff can take it from there and expand it to cover the use cases of table. For example, if T is a table, then T(1,:) is a single-row table. But T{1,:} sometimes works, if the variables' data types can be concatenated, and sometimes fails, if the variables have mixed data types. And when it does work, say when all table variables are string, why is T{1,:} a string array instead of a comma-separated list? Two similar syntaxes, x{:} and T{1,:}, have two different semantic meanings. That really causes workflow jams in my coding.
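Here is the behavior described above, condensed so it can be pasted at the command line:
x = ["a", "b"];        % 1x2 string array
x(:)                   % 2x1 string array (a column)
x{:}                   % comma-separated list of char vectors: 'a', 'b'
[x{:}]                 % concatenates that list into the char vector 'ab'
T = table("a", "b");   % single-row table whose variables are both string
T(1,:)                 % still a table (one row)
T{1,:}                 % 1x2 string array ["a" "b"] -- not a comma-separated list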