Grid Fault Location Detection Using System Simulation and Machine Learning
Patrice Brunelle, Hydro-Quebec
MathWorks and Hydro-Québec explore how both system simulation and machine learning can be used to develop algorithms that can detect the location of faults on electric grids using voltage sag measurements. System simulation is used to generate synthesized fault data that covers a broader operating envelope than measured data alone. The synthesized data is then used to train machine-learning classification algorithms. You’ll learn how the performance of classification algorithms may be used to provide further insight into the physical behavior of the system and any limitations associated with training data. You’ll also see how recommendations can be made from this insight to enhance system measurements and training data sets to improve overall classification accuracy.
Hello, everyone. My name is Graham Dudgeon, and I am principal product manager for electrical technology at MathWorks. I'm joined today by Patrice Brunelle, who is principal scientist at Hydro-Quebec's research institute, IREQ. Hi, Patrice. How are you, my friend?
I'm doing great. Thanks, Graham. I'm really happy to talk with you today.
In this presentation, Patrice and I will be talking about how both system simulation and machine learning can be used to develop algorithms that can detect the location of faults on electric grids using voltage sag measurements. Patrice will kick us off by providing some background to fault location detection, and will discuss some of Hydro-Quebec's initiatives in this area. Patrice will then describe the system under study and talk about configuring the simulation model to generate fault data at multiple locations.
Patrice will then pass back to me, and I will discuss the use of Classification Learner, which is part of the Statistics and Machine Learning Toolbox, to train and evaluate machine learning algorithms. Next, I'll take a look at how the results of a classification algorithm can help guide us to make recommendations on what additional measurements may be needed to improve overall results. I will then explore how reduced data sets affect the accuracy of classification algorithms. This helps provide guidance on how much data is needed to provide accurate classification and what limitations may exist if we are training with reduced data sets. We'll then end with a summary.
I'd like to start by setting some context behind fault location detection. Clearly, being able to precisely determine the location of a fault is of high operational value. With the precise location, system operators can take definitive action and maintenance crews can be more efficiently dispatched. Hydro-Quebec has a long history of developing advanced fault location and condition-based maintenance capabilities.
One example of this is the MILES project, MILES standing for maintenance and investigation on lines. With MILES, voltage measurements were made at key locations, and an algorithm was developed that would triangulate on a fault location based only on these measurements. The image we see on the right shows an example of the MILES fault locator estimating an actual fault.
The MILES algorithms are based on power system engineering theory. And Hydro-Quebec, like many utilities around the world, is exploring where machine learning may provide complementary capability and enhancement for operational monitoring, which is why we were excited to explore this capability with MathWorks on a representative problem.
The system under investigation is a radial distribution network, which is representative of the system used for the MILES project. The simulation has been validated against a real system. And so we have a high level of confidence that the simulated fault responses will be representative of actual response.
For this study, there are five voltage measurement locations, indicated in green here. And we chose 38 different locations in which to apply faults. For each fault location, we use 288 combinations of phase and neutral resistance. This was done to generate more than 10,000 fault scenarios. Specifically, 38 fault locations with 288 scenarios per location comes out at more than 10,000 scenarios.
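The scenario count above can be checked with a short Python sketch. The talk gives only the total of 288 resistance combinations per location, so the 24 by 12 split and the location names below are assumptions made purely for illustration.

```python
from itertools import product

N_LOCATIONS = 38  # number of fault locations, as stated in the talk

# Hypothetical split of the 288 combinations: the talk gives only the total,
# not how many phase and neutral resistance values were used.
N_R_PHASE = 24
N_R_NEUTRAL = 12

# Placeholder location names; the real model uses its own labels.
fault_locations = [f"loc{i:02d}" for i in range(1, N_LOCATIONS + 1)]

# Every (phase resistance index, neutral resistance index) pair.
resistance_pairs = list(product(range(N_R_PHASE), range(N_R_NEUTRAL)))

# Every fault location combined with every resistance pair.
scenarios = list(product(fault_locations, resistance_pairs))

print(len(resistance_pairs))  # 288
print(len(scenarios))         # 10944
```

With 38 locations and 288 combinations each, the full grid holds 10,944 scenarios, which matches the "more than 10,000" figure.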
Why did we aim for such a large number of scenarios? Well, for machine learning, typically the more data, the better. 10,000 seems like a reasonable target. Though we can, of course, generate more if needed.
I should also note that we generated normal data, meaning data from simulations where no fault was applied. And we changed only the load values, using a normal distribution profile on each load. I'll now switch to MATLAB to show you the model and the script we are using to generate the simulation data.
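As a rough illustration of how such "normal" cases could be drawn, here is a minimal Python sketch that samples each load from a normal distribution around a nominal value. The load names, nominal values, relative standard deviation, and number of cases are all hypothetical, since the talk does not state them.

```python
import random

def sample_loads(nominal_kw, rel_std=0.05, seed=None):
    """Draw one load value per load from a normal distribution.

    nominal_kw: dict of load name -> nominal value (hypothetical units: kW).
    rel_std:    standard deviation as a fraction of nominal (assumed value).
    """
    rng = random.Random(seed)
    # Clamp at zero so that an extreme draw never gives a negative load.
    return {name: max(0.0, rng.gauss(p, rel_std * p))
            for name, p in nominal_kw.items()}

# Hypothetical nominal loads; the real model's loads are not given in the talk.
nominal = {"load_1": 120.0, "load_2": 80.0, "load_3": 45.0}

# One no-fault scenario per seed; each would be passed to the simulation.
normal_cases = [sample_loads(nominal, seed=i) for i in range(100)]
print(normal_cases[0])
```

Seeding each case makes the set of load profiles reproducible, which helps when a particular scenario needs to be rerun.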
Let's first have a look at the simulation model that we developed in Simscape Electrical Specialized Power Systems. The distribution network is connected through a grid to a distribution transformer. And there is a three-phase line over 15 kilometers that I split in three portions of five kilometers each. And also, there is a two-phase branch there.
So I labeled all the blocks L1, L2, L3, and L4. There are also some single-phase distribution feeders connected to the network. I have six of them. This one I labeled L1 Phase 1. We also have bus bars to measure the voltages and currents in five locations in the model. They are labeled B1, B2, B3, B4. We collect all the signals in here. And for each fault, we are measuring the positive, negative, and zero sequence of it. We collect all the signals in one output.
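For readers unfamiliar with sequence quantities, the following Python sketch shows the standard symmetrical-component (Fortescue) transform that turns three phase phasors into zero, positive, and negative sequence phasors. It is a generic textbook calculation, not the measurement block used in the model, and the example phasors are made up.

```python
import cmath
import math

# 120-degree rotation operator used in the symmetrical-component transform.
A = cmath.exp(2j * math.pi / 3)

def sequence_components(va, vb, vc):
    """Return (zero, positive, negative) sequence phasors for phases a, b, c."""
    v0 = (va + vb + vc) / 3
    v1 = (va + A * vb + A**2 * vc) / 3
    v2 = (va + A**2 * vb + A * vc) / 3
    return v0, v1, v2

def magnitude_and_angle(v):
    """Magnitude and angle in degrees, the two features kept per sequence."""
    return abs(v), math.degrees(cmath.phase(v))

# Balanced 1.0 per-unit set with a-b-c rotation (phase b lags phase a by 120 degrees).
va = cmath.rect(1.0, 0.0)
vb = cmath.rect(1.0, -2 * math.pi / 3)
vc = cmath.rect(1.0, 2 * math.pi / 3)

v0, v1, v2 = sequence_components(va, vb, vc)
print(abs(v0), abs(v1), abs(v2))  # approximately 0, 1, 0
```

A balanced set has only a positive sequence component. An unbalanced fault produces negative and zero sequence components as well, which is why these quantities are informative features for a classifier.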
Let's have a quick look under the three-phase subsystem. You see that I split it in 1-kilometer parts, using this block. And this is where we specify the line parameters. And this allows me to have access to five or six points in the line. So I'll be able to put a fault in there.
The purpose is to have a fault block that I can program for various fault types and fault impedances. So I'll be able to set up this block, and then the script will move it along the line to all the locations we just saw before. So it will allow me to apply the fault at the 38 locations I mentioned earlier. And, of course, the same applies also for the single-phase lines.
Now, let's have a look at the script we use to generate the simulation data. Let me go full screen. Here we go.
This is where I specify general parameters, where I define the type of fault. For now, I'll program only an AB-to-ground fault with the dedicated parameters. Here, I'm listing all of the 10 lines I have in my model, where I will apply the fault, using special labels that we can use later on.
Here, this is where we can specify the lines. So now, for this simulation, let's say that we will only do 1 and 5. So the single-phase line and the two-phase line, just to show you the principle. After that, this is where, for each line in the list, we'll add the fault block to it. And I will do some settings on it. Depending on whether we are doing a three-phase fault or a single-phase fault, I have to set up the block accordingly. Then, if we go down, for each section, each insertion point, I'll do an add_line, so I'll connect the fault block to the location where I would like to apply the fault. And for each fault location, I have an array of R phase and R neutral values.
So this will give me a lot of faults, typical faults, at this location. So in the next step, let's see. Here, let's just simulate for one, just to show you the principle. We'll go faster for the simulation. And then the next step is to launch the simulation. And after the simulation, I save the simulation data in a table, so it will be available after the simulation.
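The structure of the script described above (loop over lines, then insertion points, then resistance values, collecting one table row per run) can be sketched in Python as follows. This is only an outline of the loop: it does not drive the actual simulation model, `placeholder_simulation` is a hypothetical stand-in that returns no measurements, and the line names, point counts, and resistance values are invented for the example.

```python
from itertools import product

def generate_fault_table(lines, r_phase_values, r_neutral_values, run_simulation):
    """Collect one row per (line, insertion point, R phase, R neutral) scenario.

    lines: dict of line label -> number of insertion points on that line.
    run_simulation: callable returning a dict of measured features.
    """
    rows = []
    for line, n_points in lines.items():
        for point in range(1, n_points + 1):
            for r_phase, r_neutral in product(r_phase_values, r_neutral_values):
                measurements = run_simulation(line, point, r_phase, r_neutral)
                rows.append({
                    "fault": f"{line}_S{point}",  # class label: line plus section
                    "r_phase": r_phase,
                    "r_neutral": r_neutral,
                    **measurements,
                })
    return rows

def placeholder_simulation(line, point, r_phase, r_neutral):
    """Hypothetical stand-in: a real run would return the bus sequence data."""
    return {}

# Invented values for illustration only.
lines = {"L1": 5, "L1_PH1": 4}
rows = generate_fault_table(lines, [0.1, 1.0], [0.0, 10.0, 100.0],
                            placeholder_simulation)
print(len(rows))  # 54 = (5 + 4) insertion points x 2 x 3 resistance pairs
```

Passing the simulation in as a callable keeps the bookkeeping separate from the model, so the same loop can be tested without running a single simulation.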
Now, let's start. Oops, there is a stop. Let's continue. You should see the fault block appearing in this subsystem. This is the L1 line. Here we go. It is connected to the first point where I would like to perform the fault. The model is compiling, simulating, and then I'm going to the next section, doing the same settings, applying the fault, collecting the simulation data, et cetera. I'll let it run for the rest of the line.
We now start simulation number 4. And once we are done with this L1 line, the script closes it automatically and then opens the single-phase line. Same thing here. I'm doing an add_block, and I apply the fault at the first position, then simulate, get the result, and go to the next.
There will be two more simulations, this one and then the last one. OK. So let's now go to the MATLAB Command Window. You can see here the fault locations and the table of data generated.
Let's have a quick look at the simulation results. For example, for the first one I did, the first fault on bus B1, here I get the ABC phase magnitudes, just to show you the sequence parameters I'm computing and getting. And the same for the last simulation I did, at the very same bus.
I'll now pass back to Graham, who will discuss using the machine learning tools.
Thank you, Patrice. So once the fault data is generated, we then organize it in a MATLAB table. The table includes sequence data for each bus voltage measurement and also the fault classification. The example we see here shows only a few data points for illustrative purposes. We have bus 1 sequence data for magnitude and angle and also the fault classification.
For this example, we generated data for over 10,000 scenarios. The Classification Learner is a user interface that comes with the Statistics and Machine Learning Toolbox. So I'm going to open up the Classification Learner in a moment, and I can show you some of its capabilities. I would note that I'm not going to give a comprehensive overview, so following what I show, if you would like more information, I would encourage you to refer to the documentation.
The first thing I'm going to do is load the data set and invoke the Classification Learner. I'm using projects here, so I can organize my files and create shortcuts to help me better manage my workflows. So I'm going to click get orig training data. What that will do is load the data and then invoke the Classification Learner.
If you would like more information on projects, please refer to the documentation. I will just expand the Classification Learner to the fullscreen.
In the Classification Learner, I first start a new session and load data from the workspace. In this case, my data is in the MATLAB table T. Now, it was the only variable in the workspace, so it automatically picked that one up. If you had multiple data set variables in your workspace, you would select the one you want. In this case, I didn't have to do that. You'll see that the data has been parsed automatically and that the fault column is essentially categorical, with 39 unique classifications. I will remind you that there are 38 fault locations and also one normal classification.
So because it's essentially categorical, the Classification Learner has automatically picked up fault as the response data. And the other variables within the data table T are chosen as predictors. Now, of course, you have control over this. If Classification Learner doesn't pick up the right information, you can select appropriately. But in this case, it knew exactly what I wanted to do.
What we now do is select what we want to do with validation. There are two options: cross-validation, which separates the data into training and testing sets using statistical methods, or hold-out validation, which will put aside a certain percentage of the data for testing and then use the remaining data for training.
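To make the difference between the two validation options concrete, here is a minimal Python sketch of both splitting schemes. It illustrates the general idea only and is not how Classification Learner implements them; the 25% hold-out fraction, the fold count of five, and the sample count of 20 are example values.

```python
import random

def holdout_split(n_samples, test_fraction=0.25, seed=0):
    """Hold-out: set aside one fixed fraction of the samples for testing."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = int(round(test_fraction * n_samples))
    return idx[n_test:], idx[:n_test]  # (train indices, test indices)

def kfold_splits(n_samples, k=5, seed=0):
    """k-fold cross-validation: each fold serves as the test set exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

train_idx, test_idx = holdout_split(20)
print(len(train_idx), len(test_idx))  # 15 5

for train, test in kfold_splits(20, k=5):
    print(len(train), len(test))      # 16 4, printed five times
```

With cross-validation every sample is used for testing once and for training in the other folds, so the accuracy estimate uses all of the data, at the cost of training the model k times.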
We'll stick with the default setting, which is to use cross-validation with five folds. We then click Start Session. You can see that we have defaulted to a scatter plot, which in this case is showing bus 1 voltage magnitude for the positive sequence versus bus 1 voltage magnitude for the negative sequence.
There are a couple of observations I would like to make here. First, normal operation, where no fault is applied but where we are varying load values, seems to be very clean. It's actually this small region here down at the bottom right. If I just scroll down on our classes to normal, it's the red one. If I hover over it, then we'll actually get some information on the data points that are selected. So you can see here, class normal.
So we can see that normal behavior is very clean, in that we have a tight distribution and we do not see any overlap with any of the fault conditions. We would expect normal operation to be readily classified in this case. The second observation is that while we can see a pattern in the fault data, we also see overlap of data points, meaning that classification through traditional engineering analysis would be challenging.
Let us now present this data to machine learning algorithms and see what we can achieve. The place I always start is to select All Quick-To-Train. What this will do is select a number of machine learning algorithms which, for the data set I'm presenting, will train in a relatively short amount of time. If I then select Train, this will automatically invoke a parallel pool if you have Parallel Computing Toolbox installed, which will allow the training algorithms to benefit from multiple cores.
And we can see now we have a number of different algorithms going through the training process. So we'll just let a few of those go through. As you can see, as they finish, the accuracy comes up. And the best model is going to be highlighted by the white box. So right now we have an accuracy of 67.9%. We will just let this run a little bit more to see if we can do better. The fine kNN is at 75.9%, and we have 80.6% on the medium kNN. So that's pretty good. So what we can do, while the other ones are finishing training, is look at our best one so far.
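Since k-nearest-neighbor (kNN) models come out on top here, a minimal Python sketch of the underlying idea may help: a new sample is assigned the majority class among its k closest training samples. This is a bare-bones illustration with made-up two-feature data and an arbitrary k of 3, not the implementation or presets used by Classification Learner.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training samples."""
    # Euclidean distance from x to every training sample, nearest first.
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up features (e.g. positive and negative sequence magnitude, per unit).
train_X = [
    (1.00, 0.00), (0.99, 0.01), (1.01, 0.01),   # tight "normal" cluster
    (0.60, 0.30), (0.62, 0.28), (0.58, 0.33),   # one hypothetical fault class
]
train_y = ["normal", "normal", "normal", "fault_A", "fault_A", "fault_A"]

print(knn_predict(train_X, train_y, (0.98, 0.02)))  # normal
print(knn_predict(train_X, train_y, (0.61, 0.31)))  # fault_A
```

Because kNN relies purely on distances between feature vectors, two fault classes whose measurements nearly coincide will be confused no matter how k is tuned.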
So what does 80.6% mean? To gain more insight on this number, we can view the confusion matrix. So we go here and select Confusion Matrix. The confusion matrix shows us how the training data is performing on the trained classifier. We see that we have true class versus predicted class. If we had a perfect classifier, we would see only diagonal entries on this matrix, and we would also be a little bit skeptical of the results.
Perfect classification of training data may mean that you have overfit the classification algorithm or that you have some data quality issues. In this case, we can see that we have some areas where we have a distinct off-diagonal pattern. Looking at the numbers associated with the training sets, we can see that we've got a large number associated here with L1 PH3 and L1 PH2, and also here with L1 PH4 and L1 PH5. So the classifier is struggling to distinguish the faults on lines L1 PH2 and L1 PH3, and is also struggling to classify the faults on L1 PH4 and L1 PH5.
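The relationship between overall accuracy and the off-diagonal entries of a confusion matrix can be sketched in a few lines of Python. The labels and predictions below are a tiny made-up example chosen to mimic the pattern described, not results from the study.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true class, predicted class) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))

def accuracy(counts):
    """Fraction of samples on the diagonal (true class == predicted class)."""
    total = sum(counts.values())
    correct = sum(n for (t, p), n in counts.items() if t == p)
    return correct / total

def most_confused_pairs(counts, top=3):
    """Largest off-diagonal entries: the class pairs the model mixes up most."""
    off_diagonal = [(pair, n) for pair, n in counts.items() if pair[0] != pair[1]]
    return sorted(off_diagonal, key=lambda item: item[1], reverse=True)[:top]

# Made-up example: two look-alike lines get swapped, normal is always correct.
true_labels = ["L1_PH2", "L1_PH2", "L1_PH3", "L1_PH3", "normal", "normal"]
predicted   = ["L1_PH2", "L1_PH3", "L1_PH2", "L1_PH3", "normal", "normal"]

counts = confusion_counts(true_labels, predicted)
print(round(accuracy(counts), 3))   # 0.667
print(most_confused_pairs(counts))  # both off-diagonal pairs, one sample each
```

A single accuracy figure hides where the errors are. Listing the largest off-diagonal entries is what points back to specific pairs of lines in the physical system.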
This issue forces us to go back to the physical system and determine whether there are physical characteristics that are contributing to this result. So let us consider what's happening with our system. The voltage measurements we are taking are all upstream from forked lines. The forked lines have equivalent electrical characteristics.
This means that if a fault occurs on a fork, say, at location F1 in this illustrative example, then the upstream voltage measurement shown here, while it can detect the fault, cannot distinguish whether the fault is at location F1 or F2.
Let's look at the system model again so I can show you the forked lines. OK. So we are having the issue with L1 PH2 and L1 PH3. If I just go under L1 PH2, you can see we have four segments here. And if we go to L1 PH3, we have two segments. But the lines fork at this point here. So we do indeed have a fork, which has the same electrical characteristics. And hence, this is why we are having the difficulty with the classification on L1 PH2 and L1 PH3.
The same goes for L1 PH4 and L1 PH5. We have the same setup in this case. So we just zoom in a little bit more on those areas of the confusion matrix where we're having the difficulty. So we see we have significant off-diagonal classification, which is erroneous because of the forked lines. So what can we do to improve the situation?
One solution is to make additional voltage measurements at the end of a fork. Note that, in general, we need y minus 1 additional measurements, where y is the number of forks. And so with two forks, which is the situation we have in our system, we need only one additional measurement. So we updated the simulation model to include additional measurements, in this case a measurement on L1 PH2 to help distinguish L1 PH2 and L1 PH3 faults, and a measurement on L1 PH4 to help distinguish L1 PH4 and L1 PH5 faults.
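The reasoning about forks can be illustrated with a toy Python example: if two fault classes produce the same upstream feature values, no classifier can separate them, and adding a measurement at the end of one fork makes the feature vectors differ. All numbers below are invented, and real measurements would be close rather than exactly identical; the helper names are hypothetical.

```python
from collections import defaultdict

def ambiguous_groups(samples):
    """Return the sets of class labels that share an identical feature vector."""
    by_features = defaultdict(set)
    for features, label in samples:
        by_features[features].add(label)
    return [labels for labels in by_features.values() if len(labels) > 1]

def additional_measurements(n_forks):
    """The y - 1 rule from the talk: one fork can be inferred from the others."""
    return max(n_forks - 1, 0)

# Invented feature values: a single upstream voltage magnitude per fault.
upstream_only = [
    ((0.62,), "L1_PH2"),
    ((0.62,), "L1_PH3"),   # same upstream reading as L1_PH2: indistinguishable
    ((0.80,), "L1_PH1"),
]

# Same faults with a second, hypothetical measurement at the end of L1_PH2.
with_fork_end = [
    ((0.62, 0.10), "L1_PH2"),
    ((0.62, 0.55), "L1_PH3"),
    ((0.80, 0.78), "L1_PH1"),
]

print(ambiguous_groups(upstream_only))   # one group: L1_PH2 and L1_PH3
print(ambiguous_groups(with_fork_end))   # []
print(additional_measurements(2))        # 1
```

The check is deliberately strict (exact equality). On real data the same question would be asked in terms of how much the class distributions overlap.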
So we'll now load up the new data set with additional measurements and retrain the classification algorithms. We are now training on the new data set. Now, remember, the last time, when we did not have these additional measurements, where the forked lines were an issue, the best result we had on All Quick-To-Train was 80.6%. So let's just let this go through, and we'll see what we can achieve.
75% so far on the fine tree. We'll just give it a few more seconds to let one or two more train up. 91.9%. So we're already getting better response, but the proof is in the pudding. We'll have to look at the confusion matrix to see if we are helping resolve the particular issue we had.
So let me select either fine kNN or cosine kNN. They have equal accuracy. They may have slightly different results, but I'll just choose one to take a look at here. We will look at the confusion matrix.
Actually, let me try that again. We have actually three with the same results. So it now chose the weighted kNN. So I'll just select that, and we'll take a look at the confusion matrix.
So we can now see, and you may remember, that we had a significantly larger off-diagonal component here when we were looking at L1 PH2 and L1 PH3. So we have significantly better classification than we had before. So the introduction of those additional measurements on L1 PH2 and L1 PH4 has helped us achieve a greater level of accuracy. And that also helps build our confidence that the effect we were seeing was indeed caused by the forked lines.
So I'll just make a couple more points here. I've only used all quick to train. But with the Classification Learner, you have access to a broader range of models. And it may well be that you want to take a look at support vector machines. I typically use quadratic support vector machines, because I find them to be more accurate. But they will take longer to train. So I'm not training one here in this presentation, because it does take a longer amount of time. But typically you would see more accuracy with that.
Another point is that when you do have a trained model, you can press Export Model, and then you can select a name for your model and just click OK. Then go to the MATLAB workspace. So you can see here that we have the trained model in the workspace, and it also shows you how to call it within the MATLAB workspace.
So in subsequent results I'm going to show in this presentation I'm using the MATLAB scripts to be able to do that. I'm not going to show you the MATLAB scripts, as they're just lines of code. I'd rather focus on the results in this presentation. But we can provide scripts to those who want to take a closer look at these workflows.
We will now consider training using only edge cases. The reason we're doing this is to give us some insight on the type of data we need to successfully train a classification algorithm. In particular, can we achieve accurate results on reduced data sets?
There are three cases we'll consider. First, training on fault data gathered only from the first line sections. Second, training on fault data gathered only on the last sections. And third, training on fault data gathered on both the first and last sections. We can see from the confusion matrices shown here that we get very accurate results for classification on the data provided. This is to be expected.
The question is, how will the classifier respond when fault data from other sections are passed through these models? We'll take a look at a couple of lines to explore what happens with this particular system. Let me first orient you on the confusion matrices you're seeing. So let's focus on the results on the right, trained on first section. This means we do not have predicted classes for anything other than section 1. Which is why, if you look at the columns here, you'll see Section 2, section 3, and section 4-- these are empty. That's to be expected, because we did not train on those.
So given this, a result on the diagonal is best, because we have data for that scenario. If we classify in the green box, that means we've identified the correct line. So, for example, we'll look at L1 PH4 section 2. That was the true class, which was not in the first-section training data. It's been identified as L1 PH4 section 1. So the line is the same, hence the green box. And that's the best we can do for sections we have not trained on, which is to at least have them classified on the correct line.
Anything outside the green box means that we haven't identified the correct line. So we can see, by looking at those three different edge cases, that we do not get satisfactory results. For example, the behavior of faults on section 1 of L1 PH4 does not contain sufficient information to extrapolate that a fault on another section of L1 PH4 can be identified as belonging to that line.
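The "green box" criterion, where a prediction counts as acceptable if it lands on the correct line even when the section is wrong, can be expressed as a small scoring function. The Python sketch below assumes a hypothetical label format of the form `L1_PH4_S2` (line, then section) and uses made-up predictions, not the study's results.

```python
def line_of(label):
    """Strip the section suffix: 'L1_PH4_S2' -> 'L1_PH4' (assumed label format)."""
    return label.rsplit("_S", 1)[0]

def score(true_labels, predicted_labels):
    """Return (exact accuracy, line-level accuracy).

    Exact: line and section both correct (the diagonal).
    Line-level: correct line, any section (inside the green box).
    """
    pairs = list(zip(true_labels, predicted_labels))
    exact = sum(t == p for t, p in pairs) / len(pairs)
    same_line = sum(line_of(t) == line_of(p) for t, p in pairs) / len(pairs)
    return exact, same_line

# Made-up example for a model trained on first sections only, so every
# prediction is necessarily some line's section 1.
true_labels = ["L1_PH4_S1", "L1_PH4_S2", "L1_PH4_S3", "L1_PH6_S2"]
predicted   = ["L1_PH4_S1", "L1_PH4_S1", "L1_PH5_S1", "L1_PH6_S1"]

exact, same_line = score(true_labels, predicted)
print(exact, same_line)  # 0.25 0.75
```

Reporting both numbers separates two different failures: a wrong section on the right line, which is expected for untrained sections, and a wrong line, which is the unsatisfactory outcome described above.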
Here's another example, L1 PH6. This line has only two sections, and we see that training on the last section, the middle response here, yields accurate results for identifying the correct line, while training on the first section is not accurate. So when we look at these results, and also other results which I'm not showing here, we conclude that we need a broad range of fault scenarios across every line section in order to accurately classify the fault locations. This perhaps comes as no surprise, but the remaining question is, what level of granularity do we need on the line sections to achieve acceptable levels of accuracy?
This question is outside of the scope of this presentation, but can certainly be explored through the generation of synthesized data from simulation models. So, in conclusion, the results of this study are encouraging. We've shown that classification machine learning algorithms can be used to classify fault locations with a relatively high degree of accuracy. We saw that forked lines are problematic for upstream measurements. And so, in this case, we recommend additional measurements at the end of a fork.
By making these additional measurements, we are able to achieve much better accuracy on fault location classification. We also took a look at training on reduced data sets. And we found, in this example, that training only on the first and last sections is insufficient to locate the correct line with an acceptable degree of accuracy. What that means is that a broad range of synthesized data is necessary to effectively train machine learning algorithms.