Article

Extracting automata from recurrent neural networks using queries and counterexamples (extended version)

Journal

MACHINE LEARNING
Volume -, Issue -, Pages -

Publisher

SPRINGER
DOI: 10.1007/s10994-022-06163-2

Keywords

Recurrent neural networks; Automata; Deterministic finite automata; Exact learning; Extraction

Funding

  1. European Research Council (ERC) under the European Union [802774]


The paper introduces a new algorithm for extracting a DFA from a trained RNN, avoiding state explosion even when the state vectors are large and fine differentiation between RNN states is required. The authors also discuss applying the technique to RNNs trained as language models rather than binary classifiers, and experimentally demonstrate cases in which an apparently well-trained RNN has failed to learn the intended generalisation.
We consider the problem of extracting a deterministic finite automaton (DFA) from a trained recurrent neural network (RNN). We present a novel algorithm that uses exact learning and abstract interpretation to perform efficient extraction of a minimal DFA describing the state dynamics of a given RNN. We use Angluin's L* algorithm as a learner and the given RNN as an oracle, refining the abstraction of the RNN only as much as necessary for answering equivalence queries. Our technique allows DFA-extraction from the RNN while avoiding state explosion, even when the state vectors are large and fine differentiation is required between RNN states. We experiment on multi-layer GRUs and LSTMs with state-vector dimensions, alphabet sizes, and underlying DFA which are significantly larger than in previous DFA-extraction work. Additionally, we discuss when it may be relevant to apply the technique to RNNs trained as language models rather than binary classifiers, and present experiments on some such examples. In some of our experiments, the underlying target language can be described with a succinct DFA, yet we find that the extracted DFA is large and complex. These are cases in which the RNN has failed to learn the intended generalisation, and our extraction procedure highlights words which are misclassified by the seemingly perfect RNN.
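
To make the interplay described in the abstract more concrete, the following is a minimal Python sketch, under simplifying assumptions of our own, of the two oracles the L* learner relies on: membership queries are answered directly by the RNN's classifier (classify below), and equivalence queries are answered by exploring the hypothesis DFA in parallel with a partition-based abstraction of the RNN's hidden states, refining the partition whenever it merges states the RNN classifies differently. The ToyRNN stand-in, the rounding-based quantize partition, and the bounded refinement loop are illustrative only and are not the paper's implementation (which, among other differences, splits only the offending partition cell rather than refining globally, and of course includes the L* learner itself, omitted here).

```python
from collections import deque


class ToyRNN:
    """Stand-in for a trained RNN acceptor over {a, b}: accepts words with an
    even number of 'a's, using a 2-dimensional hidden state. Illustrative only."""

    def initial_state(self):
        return (1.0, 0.0)

    def step(self, state, ch):
        return (state[1], state[0]) if ch == 'a' else state

    def accepts_from(self, state):
        return state[0] > 0.5

    def classify(self, word):            # membership-query oracle for L*
        state = self.initial_state()
        for ch in word:
            state = self.step(state, ch)
        return self.accepts_from(state)


class DFA:
    """Hypothesis automaton as proposed by the L* learner."""

    def __init__(self, initial, transitions, accepting):
        self.initial, self._delta, self._accepting = initial, transitions, accepting

    def step(self, state, ch):
        return self._delta[(state, ch)]

    def accepting(self, state):
        return state in self._accepting


def quantize(state, precision):
    """Partition of the RNN state space: round every coordinate. Refinement in
    this sketch just raises the precision; the paper splits the offending cell."""
    return tuple(round(x, precision) for x in state)


def equivalence_query(rnn, hypothesis, alphabet, precision=1, max_refinements=5):
    """Explore the hypothesis DFA in parallel with the abstracted RNN (BFS over
    pairs). A word on which the RNN and the hypothesis disagree is returned as a
    counterexample; if the abstraction merges RNN states that the network
    classifies differently, the abstraction is refined and the search restarts."""
    for _ in range(max_refinements):
        cell_label = {}          # abstract cell -> RNN label of its representative
        seen = set()
        queue = deque([("", hypothesis.initial, rnn.initial_state())])
        refined = False
        while queue:
            word, h, s = queue.popleft()
            cell = quantize(s, precision)
            if (h, cell) in seen:
                continue                         # abstraction: already explored
            seen.add((h, cell))
            label = rnn.accepts_from(s)
            if label != hypothesis.accepting(h):
                return word                      # real disagreement: counterexample for L*
            if cell_label.setdefault(cell, label) != label:
                precision += 1                   # abstraction too coarse: refine and restart
                refined = True
                break
            for ch in alphabet:
                queue.append((word + ch, hypothesis.step(h, ch), rnn.step(s, ch)))
        if not refined:
            return None                          # exploration found no disagreement
    return None


# A deliberately wrong one-state hypothesis that accepts everything; the
# parallel exploration returns a short counterexample (here, 'a').
hypothesis = DFA(0, {(0, 'a'): 0, (0, 'b'): 0}, {0})
print(equivalence_query(ToyRNN(), hypothesis, "ab"))
```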

