Retrieving Layer Activations from bertDocumentClassifier (Text Analytics Toolbox)

Question

tsvi lev on 23 Mar 2024

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/2097911-retrieving-layer-activations-from-bertdocumentclassifier-text-analytics-tooblx

Answered: tsvi lev on 30 Mar 2024 at 20:37

Accepted Answer: Malay Agarwal

Hi,

I started using the Text Analytics Toolbox and successfully trained a bertDocumentClassifier network on my dataset.

In the past I've used the 'activations' function successfully to extract layer activations from dlNetworks.

However, for a bertDocumentClassifier, I cannot get the activations function to work. Unlike, e.g., image DL network objects, it has a tokenizer first.

So, for example, out=activations(bertTrained,textstring,layername) does not work.

I tried to apply the tokenizer first, as in e.g.:

[a,b]=encode(mdl.Tokenizer,textDataTrain(1,:))

and that gives the token codes and segments fine in a,b.

But how do I "feed" those to the dlNetwork itself from the bertDocumentClassifier object?

This for example does NOT work:

net=mdl.Network;

activations(net,a,b,'out_fc2')

and variations fail as well.

So to sum up - I have a trained BERT classifier object, I can use it to classify just fine, but I can't get the network's layer activations.

Thanks!

Answer 1

Malay Agarwal on 28 Mar 2024 at 12:37

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/2097911-retrieving-layer-activations-from-bertdocumentclassifier-text-analytics-tooblx#answer_1432876

Edited: Malay Agarwal on 29 Mar 2024 at 20:51

Hi Tsvi,

I understand that you want to retrieve layer activations from a trained “bertDocumentClassifier” model.

There are two reasons why the “activations” function is not working as expected.

First, the “InputNames” property of the underlying network for the model shows that the model accepts three inputs instead of two. Namely, it expects the input IDs, an attention mask, and the segment IDs.

In your code, you are calling the function with only two inputs, the input IDs and the segment IDs.

Second, the “activations” function only works with networks represented as “DAGNetwork” objects or “SeriesNetwork” objects, as specified in the documentation: https://www.mathworks.com/help/releases/R2023b/deeplearning/ref/seriesnetwork.activations.html#d126e5157.

The underlying network for “bertDocumentClassifier” is a “dlnetwork” object: https://www.mathworks.com/help/releases/R2023b/textanalytics/ref/bertdocumentclassifier.html#mw_2480ef12-2a75-480d-aec3-eefb236d8afe. For such objects, you need to use the “predict” function for inference-mode outputs or the “forward” function for training-mode outputs.
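
To illustrate the general pattern independently of BERT, here is a minimal sketch on a small stand-in “dlnetwork”. The layers, their names ('in', 'fc1', 'relu1', 'fc2') and the random input are assumptions made up for illustration and are not part of your model. It shows how to list the input and layer names and how to request intermediate activations through the “Outputs” name-value argument:

```matlab
% Toy stand-in network (hypothetical; NOT the BERT model from this thread)
layers = [
    featureInputLayer(4, 'Name', 'in')
    fullyConnectedLayer(3, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(2, 'Name', 'fc2')];
net = dlnetwork(layers);

% Inspect what the network expects and which layer names can be requested
inputNames = net.InputNames;      % cell array of input names
layerNames = {net.Layers.Name};   % names usable with 'Outputs'

% A batch of 5 observations with 4 features each, in "CB" format
X = dlarray(rand(4, 5, 'single'), 'CB');

% Inference-mode activations of an intermediate layer
actRelu = predict(net, X, 'Outputs', 'relu1');

% Several layers can be requested in a single call
[actFc1, actFc2] = predict(net, X, 'Outputs', {'fc1', 'fc2'});

% Convert the dlarray result to an ordinary numeric array
actNumeric = extractdata(actRelu);
```

The same “InputNames” and “Layers” properties should be available on “mdl.Network”, which is a convenient way to check the number and order of inputs and the exact layer name before calling “predict”.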

Please try the following code. I am assuming you want inference-mode outputs, so it uses the “predict” function. If you want training-mode outputs, change the “predict” call to a “forward” call. No other changes are required:

% Extract the underlying dlnetwork
net = mdl.Network;
% Extract an example and encode it
example = textDataTrain(1, :);
[tokens, segments] = encode(mdl.Tokenizer, example);
% encode returns cell arrays, each holding a single vector here,
% so extract the vectors
tokens = tokens{1};
segments = segments{1};
% Number of tokens (named numTokens to avoid shadowing the dims function)
numTokens = size(tokens, 2);
% Convert the tokens and segments to dlarray
% BERT expects input in CTB format
tokens = dlarray(tokens, "CTB");
segments = dlarray(segments, "CTB");
% Create an attention mask of all zeros in CTB format
attentionMask = dlarray(zeros(1, numTokens), "CTB");
% Use predict to get the output of layer 'out_fc2'
output = predict(net, tokens, attentionMask, segments, 'Outputs', 'out_fc2');

The code:

1. Extracts the underlying network for the model.
2. Extracts a single example from the training data and encodes it into input IDs and segment IDs.
3. Since the outputs of the “encode” function are cell arrays (https://www.mathworks.com/help/releases/R2023b/textanalytics/ref/berttokenizer.encode.html#mw_3665f38f-7b0f-4514-a774-9578f8f519b4), extracts the vectors in those cell arrays.
4. Since the “forward” function expects inputs as “dlarray” objects (https://www.mathworks.com/help/releases/R2023b/deeplearning/ref/dlnetwork.forward.html#function_forward_sep_mw_bfae26e4-e62e-4b46-80e2-3d3f8305b520) and the “predict” function can work with them as well, converts the input IDs and the segment IDs into “dlarray” objects. Note that the “bertDocumentClassifier” expects input in the “CTB” data format.
5. Creates an attention mask of all zeros as a “dlarray” object, in the “CTB” data format.
6. Uses the “predict” function with the “Outputs” name-value argument to get the activations from the layer ‘out_fc2’.
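
One detail of the “CTB” conversion that is easy to trip over: a formatted “dlarray” stores its labelled dimensions in a fixed internal order, so the array size after conversion differs from the size of the raw vector. The sketch below uses made-up token codes (an assumption for illustration, not real tokenizer output) to show why the token count is read before the conversion, and how to recover it afterwards:

```matlab
% Hypothetical token codes standing in for one encoded document (1-by-6 row)
tokens = [102 2024 2004 1038 7593 103];
numTokens = size(tokens, 2);    % token count, read before labelling

% Label dimension 1 as channel (C), 2 as time (T), 3 as batch (B)
X = dlarray(tokens, 'CTB');

% dlarray reorders labelled dimensions to the order S, C, B, T, U
fmt = dims(X);                  % 'CBT'
sz = size(X);                   % [1 1 6]

% Locate the time dimension by label instead of by position
timeDim = finddim(X, 'T');      % 3
numTokensAfter = size(X, timeDim);
```

Looking up the dimension with “finddim” keeps the code correct even if the input format changes, whereas a hard-coded size(X, 2) would return 1 here.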