Update drift detector states and drift status with new data
Perform Drift Detection on Data Stream
Create a random stream such that the first 1000 observations come from a normal distribution with mean 2 and standard deviation 0.75 and the remaining 2000 come from a normal distribution with mean 4 and standard deviation 1. In an incremental drift detection application, data access and model updates happen consecutively; you would not collect the data first and then feed it into the model. However, for clarity, this example simulates the data separately.
rng(1234) % For reproducibility
numObservations = 3000;
switchPeriod1 = 1000;
X = zeros([numObservations 1]);
for i = 1:numObservations
    if i <= switchPeriod1
        X(i) = normrnd(2,0.75);
    else
        X(i) = normrnd(4,1);
    end
end
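The same two-regime stream can also be drawn without a loop. The following is a minimal sketch, assuming the size-vector form of normrnd; the variable name Xvec is introduced here only for illustration, and the sketch makes no claim that its draws match the loop version value for value.

```matlab
rng(1234) % For reproducibility
numObservations = 3000;
switchPeriod1 = 1000;

% Draw each regime in a single call and stack the two column vectors.
% Xvec is a hypothetical name used only in this sketch.
Xvec = [normrnd(2,0.75,[switchPeriod1 1]); ...
        normrnd(4,1,[numObservations-switchPeriod1 1])];

size(Xvec)   % 3000-by-1 column vector
```

The loop form in the example mirrors how a stream arrives one observation at a time; the vectorized form is only a convenience for simulation.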
Create the incremental concept drift detector. Use the Hoeffding's bounds drift detection method with the exponentially weighted moving average (EWMA) test. Specify the input type as continuous, a warmup period of 50 observations, and an estimation period of 50 observations.
incCDDetector = incrementalConceptDriftDetector("HDDMW",InputType="continuous", ...
    WarmupPeriod=50,EstimationPeriod=50)
incCDDetector = 
  HoeffdingDriftDetectionMethod

        PreviousDriftStatus: 'Stable'
                DriftStatus: 'Stable'
                     IsWarm: 0
    NumTrainingObservations: 0
                Alternative: 'greater'
                  InputType: 'continuous'
                 TestMethod: 'ewma'

  Properties, Methods
incCDDetector is a HoeffdingDriftDetectionMethod object. When you first create the object, properties such as NumTrainingObservations are at their initial state. detectdrift updates them as you feed the data incrementally and monitor for drift.
Preallocate the variables to record the drift status.
status = zeros([numObservations 1]);
statusname = strings([numObservations 1]);
Simulate the data stream one observation at a time and perform incremental drift detection. At each iteration:
Monitor for drift using the new data with detectdrift.
Track and record the drift status for visualization purposes.
When a drift is detected, reset the incremental concept drift detector by using the function reset.
for i = 1:numObservations
    incCDDetector = detectdrift(incCDDetector,X(i));
    if incCDDetector.DriftDetected
        status(i) = 2;
        statusname(i) = string(incCDDetector.DriftStatus);
        incCDDetector = reset(incCDDetector); % If drift detected, reset the detector
        sprintf("Drift detected at observation #%d. Detector reset.",i)
    elseif incCDDetector.WarningDetected
        status(i) = 1;
        statusname(i) = string(incCDDetector.DriftStatus);
        sprintf("Warning detected at observation #%d.",i)
    else
        status(i) = 0;
        statusname(i) = string(incCDDetector.DriftStatus);
    end
end
ans = "Warning detected at observation #1019."
ans = "Warning detected at observation #1020."
ans = "Warning detected at observation #1021."
ans = "Warning detected at observation #1022."
ans = "Drift detected at observation #1023. Detector reset."
Plot the drift status versus the data observation number.
gscatter(1:numObservations,status,statusname,'gyr','*',5,'on',"Number of observations","Drift status")
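Besides plotting, the recorded status vector can be queried directly. The sketch below assumes the encoding used in the loop (0 for stable, 1 for warning, 2 for drift) and fills a stand-in status vector with the warning and drift positions reported in the output above, so that it runs on its own; firstWarning and driftIdx are names introduced only for this illustration.

```matlab
% Stand-in for the status vector recorded by the detection loop,
% using the positions reported in the example output.
status = zeros([3000 1]);
status(1019:1022) = 1;   % warnings
status(1023) = 2;        % drift

firstWarning = find(status == 1,1);  % index of the first warning
driftIdx = find(status == 2);        % indices of all detected drifts

fprintf("First warning at #%d, drift at #%d.\n",firstWarning,driftIdx(1));
```

In a real run, use the status vector produced by the loop instead of the stand-in values.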
IncCDDetector — Incremental concept drift detector
X — Input data
n-by-1 vector of real numbers | logical vector | vector of 0s and 1s
Input data, specified as an n-by-1 vector of real numbers, where n is the number of observations.
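If the InputType value in the call to incrementalConceptDriftDetector is "continuous", X must be a vector of real-valued numbers.
If the InputType value in the call to incrementalConceptDriftDetector is "binary", X can be a logical vector or vector of 0s and 1s.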
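As an illustration of the binary case, the following sketch feeds a logical vector to a detector created with InputType="binary". The choice of the "hddma" detection method, the simulated stream Xbin, and the 10% rate of ones are assumptions made for this example only, not part of the example above.

```matlab
rng(1234) % For reproducibility

% Hypothetical binary stream, for example 1 = misclassified, 0 = correct.
Xbin = rand([200 1]) < 0.1;   % logical column vector

% Create a detector for binary input and feed the whole batch at once.
binDetector = incrementalConceptDriftDetector("hddma",InputType="binary");
binDetector = detectdrift(binDetector,Xbin);

binDetector.DriftStatus
```

A vector of 0s and 1s of type double, such as double(Xbin), would be passed the same way.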
W — Observation weights
n-by-1 vector of real numbers
Observation weights, specified as an n-by-1 vector of real numbers, where n is the number of observations. W must have the same number of elements as X.
You cannot use the Weights argument for the Hoeffding's Bounds Drift Detection Method using exponentially weighted moving averages (HDDMW). To use observation weights, specify DetectionMethod in the call to