Skip to content

feat: proof of Myhill-Nerode theorem for DFAs#491

Open
akhilesh-balaji wants to merge 15 commits intoleanprover:mainfrom
akhilesh-balaji:myhill-nerode
Open

feat: proof of Myhill-Nerode theorem for DFAs#491
akhilesh-balaji wants to merge 15 commits intoleanprover:mainfrom
akhilesh-balaji:myhill-nerode

Conversation

@akhilesh-balaji
Copy link
Copy Markdown

@akhilesh-balaji akhilesh-balaji commented Apr 14, 2026

Re-opening #479 which was accidentally closed. Thanks very much, @ctchou, for going through it and making edits.

This PR adds a proof of the Myhill-Nerode theorem. It builds on the definition of right congruence (and associated definitions and theorems) in #278. These are the main results shown (the statement of the theorem has been taken from Wikipedia and amended slightly):

(1) L regular iff. ∼_L (distinguishability by L, also called the Nerode
congruence) has a finite number of equivalence classes N.
(2) N is the number of states in the minimal DFA accepting L.
(3) The minimal DFA is unique up to unique isomorphism. That is, for any
minimal DFA acceptor, there exists exactly one isomorphism from it to the
following one:

Let each equivalence class ⟦ x ⟧ correspond to a state, and let state
transitions be a : ⟦ x ⟧ → ⟦ x a ⟧ for each a ∈ Σ.
Let the starting state be ⟦ ϵ ⟧, and the accepting states be ⟦ x ⟧ where
x ∈ L.

Copy link
Copy Markdown
Collaborator

@ctchou ctchou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for re-opening the PR. I have a few more comments.

Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
@akhilesh-balaji
Copy link
Copy Markdown
Author

akhilesh-balaji commented Apr 15, 2026

I have made all the changes. Also, in RegularLanguage, we have:

theorem IsRegular.iff_dfa {l : Language Symbol} :
    l.IsRegular ↔ ∃ State : Type, ∃ _ : Finite State,
      ∃ dfa : DA.FinAcc State Symbol, language dfa = l

At some point, we should ideally change this to use ∃ State : Type*, as ∃ State : Type makes proving theorems with State: Type* much more difficult. For now, I have used (in MyhillNerode):

theorem dfa_num_state_min {State : Type} {M : DA.FinAcc State α} [Finite State] :
    Nat.card State ≥ Nat.card (language M).NerodeQuotient

rather than State: Type*. Otherwise, when showing Finite (language M).NerodeQuotient, I would have to downcast the type universe, etc. The other theorems still use State: Type*.

@ctchou
Copy link
Copy Markdown
Collaborator

ctchou commented Apr 15, 2026

The State in the theorem IsRegular.iff_dfa is only a Type and not a Type* because this is how Language.IsRegular is defined in mathlib:
https://leanprover-community.github.io/mathlib4_docs/Mathlib/Computability/DFA.html#Language.IsRegular
And mathlib does this for a good reason: changing State : Type to State : Type* simply does not work in Leen's type system. We have a similar pattern in the definition of OmegaLanguage.IsRegular as well:
https://api.cslib.io/docs/Cslib/Computability/Languages/OmegaRegularLanguage.html#Cslib.%CF%89Language.IsRegular
You can try changing the Type to Type* and see what happens.

There are theorems that let you "lift" the state spaces to Type*:
https://leanprover-community.github.io/mathlib4_docs/Mathlib/Computability/DFA.html#Language.isRegular_iff
https://api.cslib.io/docs/Cslib/Computability/Languages/OmegaRegularLanguage.html#Cslib.%CF%89Language.isRegular_iff
But I usually don't bother with them, because we are dealing with finite types here and there seems no loss of generality in having those finite types living in Type 0, rather than some higher Type u.

Copy link
Copy Markdown
Collaborator

@ctchou ctchou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are getting close to an approvable shape in the organization and statements of definitions and theorems.

@chenson2018: maybe you want to take a look as well? I have not tried any proof golfing.

Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
Comment thread Cslib/Computability/Languages/MyhillNerode.lean Outdated
@ctchou
Copy link
Copy Markdown
Collaborator

ctchou commented Apr 16, 2026

@akhilesh-balaji Please mark those comments which you have fixed as resolved. Thanks!

@akhilesh-balaji
Copy link
Copy Markdown
Author

Done.

Please explain what "Nerode congruence" means. Later you talk about c_l without explaining what it means. Also, be consistent with the use of lowercase vs uppercase: either all l or all L. Don't mix them.

I have defined c_l now when stating the three parts of the theorem. Also, c_l has been defined when instance NerodeCongruence appears, and so I keep referring to this c_l. I can of course rephrase all the docstrings to be in plain English but this would arguably make it harder to read.

@ctchou
Copy link
Copy Markdown
Collaborator

ctchou commented Apr 16, 2026

Thanks for making the changes. I just pushed a commit to re-word the main comment a bit. I also turned on the CI.

I'm pretty happy with the overall structure of this PR now. I would like to take a closer look at the proofs, but that will take a few days. I hope @chenson2018 will be able to take a look as well.

@ctchou
Copy link
Copy Markdown
Collaborator

ctchou commented Apr 16, 2026

I removed some redundant lines. Please do not add redundant -- lines.

@ctchou
Copy link
Copy Markdown
Collaborator

ctchou commented Apr 16, 2026

@akhilesh-balaji There are some CI failures which you should look into. I always run locally the following steps for any PR of my own:

lake exe mk_all --module
lake build
lake lint
lake test

and correct all errors or warnings reported by them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants