Section VII: Appendix#

This section contains notes and ideas that do not serve to establish the main results of the work, but which the author believes may nevertheless prove useful in extending the formal theory into other epistemological domains.

Table of Contents#

  • Section A.I: Compound Words

  • Section A.II: Palindromic Pairs

  • Section A.III: Categories

  • Section A.IV: Sigma Induction

  • Section A.V: Reflective Structures

Section A.I: Compound Words#

Note

Part of the ambiguity in imperfect palindromes is that multiple different palindromes can map to the same σ-reduced form. When Delimiters are removed from a Sentence, a certain class of Words can remain in the Language, because their semantic content “transmutes”. In the author’s opinion, the class of Compound Words bears some relation to palindromic structures, but the exact relation has yet to be uncovered.

Definition A.1.1: Compound Words η ∈ CW_L ↔ [(∃ α, β ∈ L: η = αβ) ∨ (∃ α ∈ L, ∃ γ ∈ CW_L: η = αγ)] ∧ (η ∈ L)

This formalization can be translated into natural language as follows: A Word η in a Language L is a Compound Word if and only if,

  1. Base Case (∃ α, β ∈ L: η = αβ) ∧ (η ∈ L): η can be formed by concatenating two words from L, and η belongs to L.

  2. Recursive Step [ (∃ α ∈ L, ∃ γ ∈ CW_L: η = αγ) ∧ (η ∈ L) ]: η can be formed by concatenating a word from L with a Compound Word from L, and η belongs to L. ∎

The constraint η ∈ L ensures that the concatenated String η is not just a String, but also a valid Word within the Language L.

Examples

“teapot” is a Compound Word because it can be formed by concatenating “tea” and “pot”, and “teapot” is itself a Word in English.

“nevertheless” is a Compound Word formed from “never”, “the”, and “less”.

“formrat” is not a Compound Word: even though it can be formed by concatenating Words from the Language (“form” and “rat” are both valid Words), the concatenation “formrat” is not itself a valid Word in English.
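To make the recursion concrete, the following Python sketch checks membership in CW_L over a toy fragment of English; the lexicon and the name is_compound are illustrative assumptions, not part of the formal system.

    # A minimal sketch of Definition A.1.1 over a toy fragment of English.
    # The recursion mirrors the base case and recursive step stated above.
    L = {"tea", "pot", "teapot", "form", "rat", "race", "car", "racecar"}

    def is_compound(eta, lang):
        """Return True if eta is a Compound Word of lang per Definition A.1.1."""
        if eta not in lang:                      # the constraint η ∈ L
            return False
        for k in range(1, len(eta)):
            alpha, rest = eta[:k], eta[k:]
            if alpha in lang:
                if rest in lang:                 # base case: η = αβ with α, β ∈ L
                    return True
                if is_compound(rest, lang):      # recursive step: η = αγ with γ ∈ CW_L
                    return True
        return False

    print(is_compound("teapot", L))   # True: "tea" + "pot", and "teapot" ∈ L
    print(is_compound("formrat", L))  # False: "formrat" ∉ L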

Definition A.1.2: Compound Invertible Words η ∈ CIW_L ↔ [ (η ∈ CW_L) ∧ (η ∈ I) ]

In natural language: a Word η in a Language L is a Compound Invertible Word if and only if it is both a Compound Word and an Invertible Word. Using notation for set intersection, this definition can be revised to read,

CIW_L = CW_L ∩ I ∎

Example

“racecar” is a Compound Invertible Word because it is both a Compound Word and its own Inverse.
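A corresponding sketch of Definition A.1.2, reusing is_compound and the toy lexicon from the previous listing; the inversion check η = inv(η) is written with Python slice reversal.

    # Compound Invertible Words: Compound Words that are their own inverse.
    def is_compound_invertible(eta, lang):
        return is_compound(eta, lang) and eta == eta[::-1]

    print(is_compound_invertible("racecar", L))  # True: "race" + "car", and its own inverse
    print(is_compound_invertible("teapot", L))   # False: not its own inverse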

Section A.II: Palindromic Pairs#

The only constraint on a Language is that it must consist of Words. This is guaranteed by the Extraction Axiom S.2. The only constraint on Words is that they must not contain the Delimiter. This is guaranteed by the Delimiter Axiom W.1.

Since σ-reduction removes the Delimiter Character when it projects a Sentence onto the σ-reduced Alphabet, this process can be viewed as the construction of another formal Language. In other words, given a Language and Corpus, the operation of σ-reduction implies the existence of a second Language which encodes the original Sentences. This second Language loses much of its semantic coherence with respect to its “mother” Corpus, but it nevertheless contains semantic information.

This idea motivates the definition of a σ-Pairing Language.

Definition A.2.1: σ-Pairing Language

The σ-Pairing Language L_σ of a Corpus C_L is defined as the set of Words α that satisfy the following formula,

α ∈ L_σ ↔ ∃ ζ ∈ C_L: α = (ζ ⋅ Σ_σ)

This definition captures the idea that the σ-Pairing Language consists of all the Strings that can be generated by applying σ-reduction to the Sentences in the Corpus. It directly links the elements of L_σ to the σ-reduced forms of the Sentences, ensuring that the Pairing Language is derived from the original Corpus.
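The construction can be sketched in Python as follows, under the assumption that the Delimiter is a single space and that σ-reduction simply removes it; sigma_reduce and the toy corpus are illustrative, not part of the formal system.

    # A sketch of Definition A.2.1: L_σ is the image of the Corpus under σ-reduction.
    corpus = {"race car", "no war", "was it a cat i saw"}

    def sigma_reduce(sentence, delimiter=" "):
        """Project a Sentence onto the σ-reduced Alphabet by dropping Delimiters."""
        return sentence.replace(delimiter, "")

    L_sigma = {sigma_reduce(z) for z in corpus}
    print(L_sigma)  # {'racecar', 'nowar', 'wasitacatisaw'}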

Theorem A.2.1 ∀ α ∈ L: α ∈ L_σ ↔ [ ∃ ζ ∈ C_L: ∃ i ∈ N_Λ(ζ): ζ{i} ⊂_s α ]

This theorem can be stated in natural language as follows: The σ-Pairing Language contains a Word α if and only if there exists a Sentence ζ in the Corpus and an index i such that the ith Word ζ{i} of ζ is contained in α.

(→) Assume,

  1. α ∈ L_σ

By Definition A.2.1,

  2. ∃ ζ ∈ C_L: α = (ζ ⋅ Σ_σ).

By Definition (TODO) of σ-reduction, (ζ ⋅ Σ_σ) is obtained by concatenating the Words ζ{i} for 1 ≤ i ≤ Λ(ζ) without Delimiters,

  3. α = (ζ ⋅ Σ_σ) = (ζ{1})(ζ{2})…(ζ{n})

Since each ζ{i} is a contiguous subsequence of α, it follows from Theorem 1.2.2,

  4. ∀ i ∈ N_n: ζ{i} ⊂_s α.

Therefore,

  5. ∃ ζ ∈ C_L: ∃ i ∈ N_Λ(ζ): ζ{i} ⊂_s α

(←) Assume,

  1. ∃ ζ ∈ C_L: ∃ i ∈ N_Λ(ζ): ζ{i} ⊂_s α.

Let ζ{i} be the Word in ζ such that 1 ≤ i ≤ Λ(ζ) and

  2. ζ{i} ⊂_s α.

By Definition (TODO) of σ-reduction, (ζ ⋅ Σ_σ) is obtained by concatenating the Words ζ{i} without Delimiters,

  3. (ζ ⋅ Σ_σ) = (ζ{1})(ζ{2})…(ζ{n})

Since ζ{i} ⊂_s α and α is a String formed by concatenating Words, α must be a contiguous subsequence of (ζ ⋅ Σ_σ).

Since α is a contiguous subsequence of (ζ ⋅ Σ_σ) and both are Strings formed by concatenating the same Words in the same order (without Delimiters), it follows that,

  4. α = (ζ ⋅ Σ_σ).

Therefore, by Definition A.2.1,

  5. α ∈ L_σ

Since both directions of the implication have been proven, the theorem is established:

∀ α ∈ L: α ∈ L_σ ↔ [ ∃ ζ ∈ C_L: ∃ i ∈ N_Λ(ζ): ζ{i} ⊂_s α ] ∎

This theorem effectively characterizes the elements of the σ-Pairing Language. It states that a String belongs to the σ-Pairing Language if and only if it contains a Word from some Sentence in the Corpus. This highlights the connection between the σ-Pairing Language and the original Language and Corpus.

Definition A.2.2: Palindromic Pairing Language

Definition A.2.1 is altered in the following definition to quantify over the set of Palindromes in a Corpus. The Pairing Language that results is denoted L_P. The set of Words α which satisfy this definition is referred to as the Palindromic Pairing Language of the Language L,

α ∈ L_P ↔ ∃ ζ ∈ P: α = (ζ ⋅ Σ_σ)

In particular, if α ∈ L_P, then α is called the Palindromic Image of the Sentences ζ which generate it.

This definition is used to prove the following theorems.

Theorem A.2.2 L_P ⊂ L_σ

Assume

  1. α ∈ L_P

By Definition A.2.2,

  2. ∃ ζ ∈ P: α = (ζ ⋅ Σ_σ)

By Definition 3.2.1 of Palindromes, the set of Palindromes P is a subset of C_L. Therefore,

  3. ζ ∈ C_L

From step 2 and step 3, by Definition A.2.1, it follows,

  4. α ∈ L_σ.

Therefore,

  5. α ∈ L_P → α ∈ L_σ

This is exactly the definition of a subset,

L_P ⊂ L_σ. ∎

Theorem A.2.3: ∀ α ∈ L_P: α = inv(α)

This theorem can be stated in natural language as follows: All Words in a Palindromic Pairing Language are their own Inverses.

Assume

  1. α ∈ L_P.

By Definition A.2.2,

  2. ∃ ζ ∈ P: α = (ζ ⋅ Σ_σ)

Since ζ ∈ P, by Definition (TODO),

  3. (ζ ⋅ Σ_σ) = inv(ζ ⋅ Σ_σ)

Substituting α from step 2 into the equation in step 3,

  4. α = inv(α)

Therefore,

∀ α ∈ L_P: α = inv(α). ∎

This proof demonstrates that every String in the Palindromic Pairing Language is its own inverse. This follows directly from the definitions of Palindromes and the Palindromic Pairing Language. Since every String in the Palindromic Pairing Language is derived from a Palindrome, and Palindromes are defined by the invariance of their σ-reduction under inversion, the Strings in the Palindromic Pairing Language must also exhibit this invariance.

This theorem highlights a key property of the Palindromic Pairing Language: it consists solely of Strings that are symmetrical with respect to inversion. This property could be useful in various applications, such as identifying potential palindromes or generating text with specific symmetrical structures.
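The invariance can be checked numerically; the sketch below reuses sigma_reduce and the toy corpus from the earlier listing and assumes a Palindrome is a Sentence whose σ-reduction is invariant under inversion.

    # A small check of Theorem A.2.3 on the toy corpus above.
    def inv(s):
        return s[::-1]

    P = {z for z in corpus if sigma_reduce(z) == inv(sigma_reduce(z))}
    L_P = {sigma_reduce(z) for z in P}        # the Palindromic Pairing Language

    assert all(alpha == inv(alpha) for alpha in L_P)  # every α ∈ L_P is its own inverse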

Theorem A.2.4 L ∩ L_P ⊆ R

This theorem can be stated in natural language as follows: The intersection of a Language L and its Palindromic Pairing Language L_P is a subset of the Language’s Reflective Words R.

Assume

  1. α ∈ L ∩ L_P.

Since α ∈ L, it is a Word in the Language. Since α ∈ L_P, by Theorem A.2.3,

  2. α = inv(α).

By Definition 1.2.4 of String Inversion,

  3. ∀ i ∈ N_l(α): α[i] = α[l(α) - i + 1]

By Definition 1.3.1, it follows,

  4. α ∈ R.

Therefore,

  5. α ∈ L ∩ L_P → α ∈ R.

This in turn implies,

L ∩ L_P ⊆ R. ∎

Before moving on to the last theorem of this section, some terminology is introduced. R was introduced in Section I.III to refer to the class of Reflective Words in a Language L. To make the dependence of R on L explicit, the notation R_L will be used to indicate the Language to which the class of Reflective Words refers.

With this notation adopted, the following theorem can be proven.

Theorem A.2.5 L_P ⊂ R_{L_σ}

This theorem can be stated in natural language as follows: Given a Language L, all Words in its Palindromic Pairing Language are also Reflective Words in the σ-Pairing Language.

To prove this theorem, it must be shown that,

  1. ∀ α ∈ L: α ∈ L_P → α ∈ R_{L_σ}

Since by Definition 1.3.1,

  2. α ∈ R_{L_σ} ↔ inv(α) = α

If it can be shown,

  3. α ∈ L_P → inv(α) = α

Then the theorem will follow from the laws of deduction. But step 3 is exactly Theorem A.2.3. Therefore, the proof is complete. ∎

Section A.III: Categories#

Before introducing the notion of Categories, it must be kept in mind that a Language L and a Corpus C_L are treated as fixed sets known a priori to the construction of the current formal system. In a sense, a Language and its Corpus are taken as primitive terms. It is assumed that a semantic assignment has occurred outside of the confines of the formal system, and that the Words of the Language and the Sentences of the Corpus have already been given determinate meanings.

The notion of a Category is meant to explicate the linguistic entities which are colloquially referred to as “parts of speech”, e.g. nouns, verbs, adjectives, etc. However, it is not the intention of this formal system to treat the semantic meaning of these grammatical categories, except insofar as certain schemata of Categories provide a method of constructing semantic Sentences. The formal system takes no opinion on what constitutes its Categories, or on how these Categories are used to construct a grammatical and meaningful Sentence; rather, the formal system assumes these Categories are used in exactly that capacity, in order to derive the syntactical constraints they must abide by in order to be considered categorical.

This does not preclude the idea that a Category could map to the everyday notion of a noun or verb, but the formal construction of grammatical Categories cannot assume anything about the categorical structure of Sentences (e.g. that noun-verb-noun is a valid Sentence form) without tying the system to a specific semantic interpretation of what qualifies a Word to function in its categorical capacity.

Definition A.3.1: Category

A semantic Category in a Language L, denoted C_L(m), is a set of Words in L, where m is a natural number representing the Category’s index. ∎

Axioms#

The fundamental assumptions regarding linguistic Categories in this formal system are now introduced. Each axiom will be justified by appeal to self-evidence. To see the motivation behind the first formal assumption about Categories, note that every Word in a Language plays the role of a “part of speech”. Grammar requires that any Word that is employed must belong to at least one grammatical category, e.g. noun, verb, etc.

Axiom G.1: The Aggregation Axiom

∃ m ∈ ℕ: L = ∪_{i=1}^{m} C_L(i) ∎

This leads to the definition of a Language’s Categorical Size. By this is meant the total number of grammatical Categories that span the Language through their union. In other words, a Language can be conceived as the aggregation of all its grammatical Categories.

Definition A.3.2 Categorical Size

The m such that,

L = ∪_{i=1}^{m} C_L(i)

is denoted with the lowercase Greek kappa, κ. κ is called the Categorical Size of a Language. ∎

It is important to note that the formal system takes no opinion on the nature of its Categories, i.e. on what role a particular Category serves in the formation of a grammatical Sentence. Instead, the Aggregation Axiom G.1 simply states that, no matter the semantic function assigned to each Category, these assignments must syntactically span the entire Language.

The choice of axioms governing the logical calculus of Categories in the formal system is critical. Since the notion of a “grammatical category” is inherently tied to the semantic interpretation of a Language and Corpus, the assumptions introduced about their nature must not violate the empirical reality of natural languages.

To see what is meant by this, consider the proposed axiom, the Uniqueness Axiom.

Proposed Axiom: The Uniqueness Axiom

∀ ζ ∈ C_L: ∀ i ∈ N_Λ(ζ): (∃! m ∈ N_κ: ζ{i} ∈ C_L(m)) ∧ ( (i, C_L(m)) ∈ C_ζ ) ∎

In natural language, the Uniqueness Axiom states: For every Sentence ζ in the Corpus and for every Word index i in ζ, there exists a unique Category index m such that the ith Word of ζ belongs to Category C_L(m), and this Category is recorded in the Categorical-level representation C_ζ at index i.

This axiom captures a common-sense (though subtly flawed) notion that each Word in a Sentence maps to a single Category. However, this picture of “grammaticality” tacitly assumes a single available semantic interpretation. To see why this axiom should not be adopted in a formal system that is meant to model any language, it suffices to exhibit a single example in a known language which contradicts it.

Consider the sentence ᚠ = “visiting friends can be annoying”. In this case, there are two valid Categorical-level representations of this Sentence in English,

C_ζ(1) = { (1, Verb), (2, Noun), (3, Verb), (4, Verb), (5, Adjective) }

C_ζ(2) = { (1, Adjective), (2, Noun), (3, Verb), (4, Verb), (5, Adjective) }

Therefore, if the formal system wishes to account for the subtle ambiguities of natural language, the Uniqueness Axiom cannot be adopted as an assumption.
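The ambiguity can be made concrete with a short enumeration; the category assignments below form a hypothetical lexicon introduced only for illustration.

    # Enumerating the Categorical-level representations of an ambiguous Sentence.
    from itertools import product

    lexicon = {
        "visiting": {"Adjective", "Verb"},   # the ambiguous Word
        "friends": {"Noun"},
        "can": {"Verb"},
        "be": {"Verb"},
        "annoying": {"Adjective"},
    }
    sentence = ["visiting", "friends", "can", "be", "annoying"]

    interpretations = [
        list(enumerate(choice, start=1))
        for choice in product(*(sorted(lexicon[w]) for w in sentence))
    ]
    for c in interpretations:
        print(c)
    # Two representations, differing only in the Category assigned to Word 1,
    # so no unique m exists for i = 1 and the Uniqueness Axiom fails.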

Theorems#

Theorem A.3.1: ∀ α ∈ L: ∃ i ∈ N_κ: α ∈ C_L(i)

By Axiom G.1 and Definition A.3.2,

L = ∪_{i=1}^{κ} C_L(i)

Therefore, any Word α in L must belong to at least one of these Categories. ∎

Categorical Length#

Consider the English sentences ᚠ = “the man ran over the bridge” and ᚢ = “the novel novel about a rose rose to the top”.

In ᚠ, both “man” and “bridge” map to the same Category, namely nouns. In other words, a Sentence can contain multiple Words that belong to the same Category.

In ᚢ, the two occurrences of “novel” map to different Categories, namely adjective and noun. Further confounding the matter, the simultaneous noun-verb mapping of “rose” provides another example of a single Word mapping to multiple Categories.

Since multiple Words can belong to the same Category, and conversely, the same Word can belong to multiple Categories, a notion of measuring the Categorical Length of a Sentence is now introduced. This notion will only measure the unique Categories found in a Sentence. For example, “man” and “bridge” would both be occurrences of the noun Category and would thus contribute a length of 1 to the Categorical Length.

Similar to the construction of the Character-level and Word-level representations of a String, a method for constructing the Categorical-level representation of a Sentence is given in the next definition.

Definition A.3.3: Categorical-level Representation

Let ζ be an arbitrary Sentence from a Corpus C_L. The Categorical-level representation of ζ, denoted C_ζ, is defined as the set of sets x which satisfy the following open formula,

x ∈ C_ζ ↔ x = { (i, C_L(m)) | ∀ i ∈ N_Λ(ζ): ζ{i} ∈ C_L(m) } ∎

Definition A.3.4: Categorical Interpretation

Let ζ be an arbitrary Sentence from a Corpus C_L. The ith Categorical Interpretation of ζ, denoted C_ζ(i), is defined as the ith element of the Categorical-level representation,

C_ζ(i) ∈ C_ζ ∎

Definition A.3.5: Interpretation Length

Let ζ be an arbitrary Sentence from a Corpus C_L. The Interpretation Length of a Sentence ζ, denoted ι(ζ), is defined as the number such that,

ι(ζ) = | C_ζ | ∎

Definition A.3.6: Categorical Length

Let ζ be an arbitrary Sentence from a Corpus C_L. The Categorical Length of the ith Categorical Interpretation of ζ, denoted λ(ζ, i), is defined as,

λ(ζ, i) = | C_ζ(i) | ∎
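Under the same hypothetical lexicon as the earlier sketch, the two lengths can be computed directly; counting unique Categories in λ follows the prose motivation above, which is an interpretive assumption.

    # Sketches of Interpretation Length ι(ζ) and Categorical Length λ(ζ, i).
    def interpretation_length(interps):
        return len(interps)                        # ι(ζ) = |C_ζ|

    def categorical_length(interps, i):
        cats = {category for _, category in interps[i - 1]}
        return len(cats)                           # unique Categories in C_ζ(i)

    print(interpretation_length(interpretations))  # 2
    print(categorical_length(interpretations, 1))  # 3: Adjective, Noun, Verb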

Section A.IV: Sigma Induction#

The operation of σ-reduction possesses unique characteristics that distinguish it from the typical arithmetical or geometrical operations studied in abstract algebra. If linguistics is said to have an algebraic structure and σ-reduction is to be identified as one of its essential components, then this presents a problem with respect to the operation which is to be understood as the inverse of σ-reduction. Strictly speaking, σ-reduction does not possess an inverse operation. Once a Sentence has been projected onto the σ-reduced Alphabet, information necessary and sufficient for the construction of its semantic interpretation has been lost. However, analogous to the case of a square root, this does not imply that a σ-induction cannot be defined, if the range of its inversion is suitably restricted.

The analysis of this problem will carry the work heavily into combinatorics. This section of the Appendix is a preliminary analysis of the challenges and problems any formulation of σ-induction must overcome in order to claim validity as a linguistic operation.

To start, note that knowing the length of a σ-reduced Sentence, l(ς(ζ)), and the number of Words in the original Sentence, Λ(ζ), significantly constrains the possibilities for reconstructing the original Sentence from its σ-reduced form. This has implications for the potential reversibility of σ-reduction and for understanding the structure of Sentences.

l(ς(ζ)) contains information about the non-Delimiter Characters in the original Sentence ζ, and their relative ordering, as demonstrated by Theorem 3.1.6. In other words, although the Words are no longer delimited, the σ-reduction of a Sentence still contains every Word in the original Sentence,

∀ ζ ∈ C_L: ∀ i ∈ N_Λ(ζ): ζ{i} ⊂_s ς(ζ)

If the additional piece of information Λ(ζ) is at hand, then from Theorem 2.4.1,

Λ(ζ) = Δ(ζ) + 1.

In other words, the number of Delimiters is always one less than the number of Words. This constrains the number of possible combinations that need to be considered when inducing in the σ-reduced space. The Delimiters must be placed between the Words in a way that creates valid Words in the Language L, and not all arrangements of Delimiters will result in valid Words.

The problem of reconstructing the original Sentence from its σ-reduced form and the number of Words is analogous to the problem of integer partitioning in number theory. Integer partitioning is the problem of finding all possible ways to write an integer as a sum of positive integers. For example, the integer 4 can be partitioned in the following ways,

4
3 + 1
2 + 2
2 + 1 + 1
1 + 1 + 1 + 1

In the case of σ-reductions, the String Length of the reduction, l(ς(ζ)), is analogous to the integer being partitioned, while Λ(ζ) is analogous to the number of parts in the partition. The String Lengths of the individual words in the sentence are analogous to the summands in the partition.
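The analogy can be exercised directly; partitions reproduces the table above, while compositions (ordered partitions into m positive parts) is the variant that corresponds most closely to Delimiter placement, since Word order matters. Both function names are illustrative.

    # Integer partitions of n, and compositions of n into exactly m positive parts.
    def partitions(n, max_part=None):
        max_part = n if max_part is None else max_part
        if n == 0:
            yield []
            return
        for part in range(min(n, max_part), 0, -1):
            for rest in partitions(n - part, part):
                yield [part] + rest

    def compositions(n, m):
        if m == 1:
            yield (n,)
            return
        for first in range(1, n - m + 2):
            for rest in compositions(n - first, m - 1):
                yield (first,) + rest

    print(list(partitions(4)))       # [[4], [3, 1], [2, 2], [2, 1, 1], [1, 1, 1, 1]]
    print(list(compositions(4, 2)))  # [(1, 3), (2, 2), (3, 1)]: possible Word-Length splits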

While σ-reduction is not strictly reversible, knowing l(ς(ζ)) and Λ(ζ) significantly reduces the number of possible Sentences that could have produced the given σ-reduced form.

In some cases, if the Language L has strong constraints on Word formation, and if l(ς(ζ)) and Λ(ζ) are sufficiently restrictive, it is conceivable that the original Sentence could be uniquely reconstructed, or at least narrowed down to a small set of possibilities.

These insights lead to a formal definition of a σ-induction.

TODO

Definition A.4.1: σ-induction

Let s be a String over Σ_σ (a σ-reduced String), let m be a natural number representing the desired number of “word-forms” (intended to correspond to Words or potentially other linguistic units) in the resulting Strings, and let X be a set of Strings (either S, the set of all Strings, or C_L, the set of Sentences in the Language L).

The σ-induction of s with m word-forms over the set X, denoted σ_induce(s, m, X), is the set of all possible Strings that can be formed by inserting m-1 Delimiters into s such that:

  1. Delimiter Placement: Delimiters are inserted only between Characters of s, or at the beginning or end of s.

  2. Word-Form Validity: each of the m resulting substrings (separated by Delimiters) is a valid String in the set X.

  3. Number of Word-Forms: the resulting String has exactly m word-forms.

  4. Order Preservation: the relative order of the Characters of s is preserved in the resulting String.

Formally,

σ_induce(s, m, X) = { x ∈ X | σ_reduce(x) = s ∧ Λ(x) = m }

Explanation:

  • Input: the function takes a σ-reduced String s, the desired number of word-forms m, and a set of Strings X.

  • Output: a set of Strings, where each String is a possible “re-delimitation” of s that satisfies the given conditions; crucially, each re-delimitation belongs to the set X.

  • Conditions: Delimiter Placement ensures that Delimiters are placed in valid positions. Word-Form Validity ensures that all the resulting substrings are valid members of the set X: if X = S, no check is made beyond ensuring the substrings are valid Strings; if X = C_L, each substring is verified as a valid Word in the Language L. Number of Word-Forms ensures that each String has exactly m word-forms. Order Preservation ensures that the non-Delimiter Characters in the resulting Strings maintain the same order as in the input String s.

Examples:

Let s = “nowart” and L = { “no”, “now”, “wart”, “art”, “a”, “on” }.

  • σ_induce(s, 2, S) = { “no wart”, “now art”, “noσwart”, “nowσart”, … }

  • σ_induce(s, 2, C_L) = { “no wart”, “now art” }

  • σ_induce(s, 3, S) = { “noσwσart”, “nσowσart”, … }

  • σ_induce(s, 3, C_L) = { } (no valid Sentences with 3 Words)

Observations:

  • Flexibility: this definition allows σ-induction to be performed over different sets of Strings, providing flexibility in the analysis.

  • Relationship to Previous Definitions: σ_induce(s, m, C_L) restricts the resulting Strings to valid Sentences in the Language L, while σ_induce(s, m, S) considers all possible Strings, regardless of whether they are valid Sentences.

Further Considerations:

  • Computational Complexity: generating σ_induce(s, m, S) is computationally simpler than generating σ_induce(s, m, C_L), as it does not require checking for Word validity in L.

  • Linguistic Relevance: σ_induce(s, m, C_L) is more linguistically relevant, as it focuses on valid Sentences.

  • Empty String: the behavior when s is the empty String or when m is less than 1 is left unspecified for now.

This definition is deliberately general: it distinguishes between inducing over all Strings and inducing over the Sentences of a specific Language, and the notion of “word-forms” leaves room for linguistic units other than Words.
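A sketch of σ_induce follows, under two simplifying assumptions: the Delimiter is rendered as a single space, and Delimiters are restricted to the internal positions of s (so no empty word-forms arise). The predicate argument stands in for membership in X; all names are illustrative.

    # A sketch of Definition A.4.1 restricted to internal Delimiter positions.
    from itertools import combinations

    L = {"no", "now", "wart", "art", "a", "on"}

    def sigma_induce(s, m, valid_word=None):
        """All re-delimitations of s into exactly m word-forms."""
        results = set()
        for cut in combinations(range(1, len(s)), m - 1):
            bounds = [0, *cut, len(s)]
            words = [s[i:j] for i, j in zip(bounds, bounds[1:])]
            if valid_word is None or all(valid_word(w) for w in words):
                results.add(" ".join(words))
        return results

    print(sigma_induce("nowart", 2))                    # all five two-word splits
    print(sigma_induce("nowart", 2, lambda w: w in L))  # {'no wart', 'now art'}
    print(sigma_induce("nowart", 3, lambda w: w in L))  # set(): no valid 3-Word Sentence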

Theorem A.4.1:

∀ s ∈ S, ∀ m ∈ ℕ: |σ_induce(s, m, C_L)| ≤ C(l(σ_reduce(s)) - 1, m - 1)

Translation: For any String s and any natural number m (representing the number of Words), the cardinality of the set of Sentences in C_L obtained by σ-induction of s with m Words is less than or equal to the number of combinations of choosing m-1 Delimiter positions from the l(σ_reduce(s)) - 1 possible positions.

Proof:

Let s be an arbitrary String in S, and let m be a natural number.

  1. Length of σ_reduce(s): let n = l(σ_reduce(s)). Since σ_reduce(s) is a σ-reduced String, it has no Delimiters.

  2. Delimiter Positions: in order to form a Sentence with m Words from σ_reduce(s), m-1 Delimiters must be inserted.

  3. Possible Positions: there are n-1 possible positions where Delimiters can be inserted between the Characters of σ_reduce(s).

  4. Combinations: the number of ways to choose m-1 positions out of n-1 positions is given by the binomial coefficient,

C(n-1, m-1) = (n-1)! / [(m-1)! (n-m)!]

  5. Upper Bound: the set σ_induce(s, m, C_L) contains Sentences formed by inserting m-1 Delimiters into σ_reduce(s) such that the resulting substrings are valid Words in L. Since there are at most C(n-1, m-1) ways to insert the Delimiters, the number of valid Sentences in σ_induce(s, m, C_L) cannot be greater than this number.

Therefore,

|σ_induce(s, m, C_L)| ≤ C(l(σ_reduce(s)) - 1, m - 1)

Since s and m were arbitrary, the statement generalizes over all of S and ℕ. This completes the proof. ∎

The binomial coefficient C(n-1, m-1) gives the maximum number of ways to insert the Delimiters; the actual number of valid Sentences may be less, due to the constraint that the resulting substrings must be valid Words in L.

Implications:

  • Upper Bound: this theorem provides an upper bound on the number of possible Sentences that can be generated by σ-induction.

  • Combinatorial Nature: it highlights the combinatorial nature of the problem of reconstructing Sentences from their σ-reduced forms.

  • Language Constraints: the actual number of valid Sentences will be less than or equal to C(l(σ_reduce(s)) - 1, m - 1) and will depend on the specific constraints imposed by the Language L.

Simplified Problem:

The count becomes exact when inducing over the set of all Strings:

  • s: a σ-reduced String (with no Delimiters).

  • m: the desired number of “words” (substrings separated by Delimiters).

  • σ_induce(s, m, S): the set of all Strings formed by inserting m-1 Delimiters into s.

Calculation:

  1. Length of s: let n = l(s).

  2. Possible Delimiter Positions: there are n-1 positions between the Characters of s, plus the position before the first Character and the position after the last Character, for a total of n+1 potential positions. However, a Delimiter at the beginning or end of s would produce an empty word-form, so there are n-1 positions in which the m-1 Delimiters can be placed.

  3. Choosing Delimiter Positions: m-1 positions must be chosen out of these n-1 valid positions. Since the order in which the Delimiters are placed does not matter, this is a combination problem.

  4. Combinations: the number of ways to choose m-1 positions from n-1 is given by the binomial coefficient,

C(n-1, m-1) = (n-1)! / [(m-1)! (n-m)!]

Theorem A.4.2:

∀ s ∈ Σ_σ, ∀ m ∈ ℕ: |σ_induce(s, m, S)| = C(l(s) - 1, m - 1)

Proof:

Let s be an arbitrary σ-reduced String over Σ_σ, and let m be a natural number.

  1. Length of s: let n = l(s).

  2. Delimiter Positions: to form a String with m word-forms from s, m-1 Delimiters must be inserted.

  3. Possible Positions: in a σ-reduced String of Length n, there are n-1 positions between the Characters where Delimiters can be inserted.

  4. Combinations: the number of ways to choose m-1 positions out of n-1 positions is given by the binomial coefficient,

C(n-1, m-1) = (n-1)! / [(m-1)! (n-m)!]

  5. σ_induce(s, m, S): the set σ_induce(s, m, S) contains all Strings formed by inserting m-1 Delimiters into s in any of the possible positions. Since each combination of Delimiter placements results in a unique String, the cardinality of σ_induce(s, m, S) is equal to the number of possible combinations.

Therefore,

|σ_induce(s, m, S)| = C(l(s) - 1, m - 1)

Since s and m were arbitrary, the statement generalizes. This completes the proof. ∎
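Reusing the sigma_induce sketch (internal placements only), both counts can be checked numerically for a small case; the equality instantiates Theorem A.4.2 and the inequality instantiates Theorem A.4.1.

    # Numeric check of the exact count over S and the upper bound over the Language.
    from math import comb

    s, m = "nowart", 2
    over_S = sigma_induce(s, m)                        # no Word-validity check
    over_L = sigma_induce(s, m, lambda w: w in L)

    assert len(over_S) == comb(len(s) - 1, m - 1)      # 5 == C(5, 1)
    assert len(over_L) <= comb(len(s) - 1, m - 1)      # 2 <= 5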

A stars-and-bars argument gives an alternative form of this count:

Theorem A.4.3: ∀ s ∈ Σ_σ, ∀ m ∈ ℕ: |σ_induce(s, m, S)| = C(l(s) + m - 2, m - 1) = C(l(s) + m - 2, l(s) - 1)

Proof:

Let s be an arbitrary σ-reduced String over Σ_σ, and let m be a natural number.

  1. Length of s: let n = l(s).

  2. Delimiter Positions: to form a String with m “words” (substrings separated by Delimiters) from s, m-1 Delimiters must be inserted.

  3. Possible Positions: in a String of Length n, there are n-1 positions between the Characters where Delimiters can potentially be placed. Additionally, Delimiters can be placed at the beginning or the end of the String. However, no two Delimiters may be placed consecutively, nor may a Delimiter be placed next to an already existing Delimiter.

  4. Stars and Bars: the Characters of s can be represented as “stars” (*) and the Delimiters as “bars” (|). For example, if s = “abc” and 2 Delimiters are to be inserted (m = 3), one possible arrangement is “a|b|c” (represented as *|*|*); another is “|abc|” (represented as |***|). Notice that there are n “stars” and m-1 “bars”.

  5. Combinatorial Problem: the problem of placing m-1 Delimiters in a String of Length n is thus equivalent to arranging n “stars” and m-1 “bars” in a sequence, subject to the restriction that no two bars be adjacent.

  6. Number of Arrangements: without restriction, the number of ways to arrange n stars and m-1 bars is given by the binomial coefficient C(n + m - 1, m - 1), or equivalently C(n + m - 1, n). Accounting for the adjacency restriction by reducing the star count by one, there are C(l(s) + m - 2, m - 1) possible ways to arrange the Delimiters.

  7. σ_induce(s, m, S): the set σ_induce(s, m, S) contains all Strings formed by inserting m-1 Delimiters into s in any of the possible positions. Since each arrangement of Delimiter placements results in a unique String, the cardinality of σ_induce(s, m, S) is equal to the number of possible arrangements, C(l(s) + m - 2, m - 1).

Therefore,

|σ_induce(s, m, S)| = C(l(s) + m - 2, m - 1)

Since s and m were arbitrary, the statement generalizes:

  • ∀ s ∈ Σ_σ, ∀ m ∈ ℕ: |σ_induce(s, m, S)| = C(l(s) + m - 2, m - 1) = C(l(s) + m - 2, l(s) - 1) ∎

How This Helps with σ-induction:

The theorems about delimiter symmetry in perfect palindromes (3.2.4 and 3.2.5) are key to simplifying the calculation of |σ_induce(s, m, S)| when s is the σ-reduction of a perfect palindrome.

Specifically:

Reduced Search Space: Instead of considering all possible delimiter placements in s, we only need to consider placements in the left half of s (up to the pivot). The placements in the right half are then determined by symmetry.

Simplified Combinations:

  • For even-Length perfect palindromes with an even number of Words m, (m-2)/2 Delimiter positions must be chosen in the left half of s (of Length l(s)/2).

  • For even-Length perfect palindromes with an odd number of Words m, (m-1)/2 Delimiter positions must be chosen in the left half of s (of Length l(s)/2).

  • For odd-Length perfect palindromes with an even number of Words m, (m-2)/2 Delimiter positions must be chosen in the left half of s (of Length (l(s)-1)/2).

  • For odd-Length perfect palindromes with an odd number of Words m, (m-1)/2 Delimiter positions must be chosen in the left half of s (of Length (l(s)-1)/2).

Calculating |σ_induce(s, m, S)| for Perfect Palindromes:

Formulas for each case are derived below, assuming s is the σ-reduction of a perfect palindrome ζ (i.e., s = σ_reduce(ζ) and ζ ∈ PP):

Case 1: even-Length s (l(s) = 2k), even m (m = 2j):

|σ_induce(s, m, S)| = C(k - 1, j - 1) = C(l(s)/2 - 1, m/2 - 1)

Case 2: even-Length s (l(s) = 2k), odd m (m = 2j + 1):

|σ_induce(s, m, S)| = C(k - 1, j) = C(l(s)/2 - 1, (m-1)/2)

Case 3: odd-Length s (l(s) = 2k + 1), even m (m = 2j):

|σ_induce(s, m, S)| = C(k - 1, j - 1) = C((l(s)-1)/2 - 1, m/2 - 1)

Case 4: odd-Length s (l(s) = 2k + 1), odd m (m = 2j + 1):

|σ_induce(s, m, S)| = C(k, j) = C((l(s)-1)/2, (m-1)/2)

Explanation:

  • The Length of s is divided by 2 (or decremented by one and then divided by 2 for odd-Length s) to obtain the length of the left half.

  • m is divided by 2 (after subtracting one or two, depending on parity) to obtain the number of Delimiters to place in the left half.

  • The combination formula C(n, r) counts the number of ways to choose r Delimiter positions from the n available positions.

Theorem A.4.4:

Let ζ ∈ PP with s = σ_reduce(ζ), n = l(s), and let m be the desired number of Words. Then:

Case 1: even-Length s (n = 2k), even m (m = 2j):

|σ_induce(s, m, S)| = C(k - 1, j - 1) = C(n/2 - 1, m/2 - 1)

Case 2: even-Length s (n = 2k), odd m (m = 2j + 1):

|σ_induce(s, m, S)| = C(k - 1, j) = C(n/2 - 1, (m-1)/2)

Case 3: odd-Length s (n = 2k + 1), even m (m = 2j):

|σ_induce(s, m, S)| = C(k - 1, j - 1) = C((n-1)/2 - 1, m/2 - 1)

Case 4: odd-Length s (n = 2k + 1), odd m (m = 2j + 1):

|σ_induce(s, m, S)| = C(k, j) = C((n-1)/2, (m-1)/2)

Proof:

Let ζ be an arbitrary perfect palindrome (ζ ∈ PP) and let s = σ_reduce(ζ), n = l(s), and m be the desired number of words.

Case 1: Even-length s (n = 2k), Even m (m = 2j):

Pivot: Since n is even, the pivot of ζ falls between two Characters. By Theorem 3.1.9, l(ζ[:ω(ζ)]) = l(ζ[ω(ζ):]) + 1. Since ζ is a perfect palindrome, by Theorem 3.1.6, σ_reduce(ζ) = inv(σ_reduce(ζ)). The pivot of s lies between the Characters at indices k and k+1.

Delimiter Placement: To form m = 2j words, we need to place m-1 = 2j-1 delimiters. By Theorem 3.2.4, the delimiters must be placed symmetrically around the pivot. We place j-1 delimiters in the left half of s (excluding the pivot character) and mirror them to the right half.

Left Half: The left half of s has length k. We have k-1 possible positions to place delimiters (excluding the character at index k itself because n is even).

Combinations: We need to choose j-1 positions out of k-1 to place the delimiters. The number of ways to do this is C(k-1, j-1).

Symmetry: For each valid placement in the left half, there’s a unique corresponding symmetrical placement in the right half.

Conclusion: Therefore, |σ_induce(s, m, S)| = C(k - 1, j - 1) = C(n/2 - 1, m/2 - 1).

Case 2: Even-length s (n = 2k), Odd m (m = 2j + 1):

Pivot: Since n is even, the pivot of ζ falls between two Characters. By Theorem 3.1.9, l(ζ[:ω(ζ)]) = l(ζ[ω(ζ):]) + 1. Since ζ is a perfect palindrome, by Theorem 3.1.6, σ_reduce(ζ) = inv(σ_reduce(ζ)). The pivot of s lies between the Characters at indices k and k+1.

Delimiter Placement: To form m = 2j+1 words, we need to place m-1 = 2j delimiters. We place j delimiters in the left half of s (excluding the pivot character) and mirror them to the right half.

Left Half: The left half of s has length k. We have k-1 possible positions to place delimiters (excluding the character at index k itself because n is even).

Combinations: We need to choose j positions out of k-1 to place the delimiters. The number of ways to do this is C(k-1, j).

Symmetry: For each valid placement in the left half, there’s a unique corresponding symmetrical placement in the right half.

Conclusion: Therefore, |σ_induce(s, m, S)| = C(k - 1, j) = C(n/2 - 1, (m-1)/2).

Case 3: Odd-length s (n = 2k + 1), Even m (m = 2j):

Pivot: Since n is odd, the pivot of ζ falls on a Character. By Theorem 3.1.8, since ζ is a perfect palindrome, σ_reduce(ζ) = inv(σ_reduce(ζ)). The pivot of s is the Character at index k+1. Since m is even, m-1 is odd, and by Theorem 3.2.5 the one unpaired Delimiter must fall at the pivot.

Delimiter Placement: To form m = 2j words, we need to place m-1 = 2j-1 delimiters. By Theorem 3.2.4, the delimiters must be placed symmetrically around the pivot. We place j-1 delimiters in the left half of s (excluding the pivot character) and mirror them to the right half. The remaining delimiter is placed at the pivot.

Left Half: The left half of s, excluding the pivot character, has length k. We have k-1 possible positions to place delimiters (excluding the character at index k itself because n is odd).

Combinations: We need to choose j-1 positions out of k-1 to place the delimiters. The number of ways to do this is C(k-1, j-1).

Symmetry: For each valid placement in the left half, there’s a unique corresponding symmetrical placement in the right half.

Conclusion: Therefore, |σ_induce(s, m, S)| = C(k - 1, j - 1) = C((n-1)/2 - 1, m/2 - 1).

Case 4: Odd-length s (n = 2k + 1), Odd m (m = 2j + 1):

Pivot: Since n is odd, the pivot of ζ falls on a character. By Theorem 3.1.8, since ζ is a perfect palindrome, σ_reduce(ζ) = inv(σ_reduce(ζ)). The pivot of s is the character at index k+1. Since m is odd, by Theorem 3.2.5, this pivot character cannot be a delimiter.

Delimiter Placement: To form m = 2j+1 words, we need to place m-1 = 2j delimiters. We place j delimiters in the left half of s (excluding the pivot character) and mirror them to the right half.

Left Half: The left half of s, excluding the pivot character, has length k.

Combinations: We need to choose j positions out of k to place the delimiters. The number of ways to do this is C(k, j).

Symmetry: For each valid placement in the left half, there’s a unique corresponding symmetrical placement in the right half.

Conclusion: Therefore, |σ_induce(s, m, S)| = C(k, j) = C((n-1)/2, (m-1)/2).

Final Result:

Combining all four cases establishes the theorem as stated above. This completes the proof. ∎
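The four cases can be collected into a single dispatch on parities; the sketch below merely evaluates the stated formulas and does not re-derive them.

    # Direct transcription of the four cases of Theorem A.4.4.
    from math import comb

    def palindrome_induction_count(n, m):
        """|σ_induce(s, m, S)| for s = σ_reduce(ζ), ζ ∈ PP, n = l(s)."""
        k, j = n // 2, m // 2
        if n % 2 == 0 and m % 2 == 0:   # Case 1: even n, even m (m = 2j)
            return comb(k - 1, j - 1)
        if n % 2 == 0:                  # Case 2: even n, odd m (m = 2j + 1)
            return comb(k - 1, j)
        if m % 2 == 0:                  # Case 3: odd n, even m
            return comb(k - 1, j - 1)
        return comb(k, j)               # Case 4: odd n, odd m

    print(palindrome_induction_count(8, 4))   # C(3, 1) = 3
    print(palindrome_induction_count(13, 3))  # C(6, 1) = 6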

Section A.V: Reflective Structures#

Definition A.5.1: Reflective Structure

A Reflective Structure, denoted RS, is the set of Strings s which satisfy the following formula,

s ∈ RS ↔ [ ∃ n ∈ ℕ, ∃ p ∈ Χ_L(n): (s = Π_{i=1}^{n} p(i)) ∧ (ς(s) = inv(ς(s))) ]

TODO
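Pending the formal development, the defining condition can be sketched directly: a concatenation of Words is a Reflective Structure when its σ-reduction equals its own inverse. The tuple-of-Words input is an illustrative stand-in for p ∈ Χ_L(n).

    # A minimal sketch of Definition A.5.1.
    def is_reflective_structure(words):
        s = "".join(words)       # s = Π p(i), with Delimiters removed by ς
        return s == s[::-1]      # ς(s) = inv(ς(s))

    print(is_reflective_structure(("race", "car")))  # True
    print(is_reflective_structure(("now", "art")))   # False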

Theorem A.5.1 R ⊆ RS

TODO

Theorem A.5.2 ∀ α ∈ L: α ∈ RS ↔ (α)(σ)(inv(α)) ∈ RS

TODO

Theorem A.5.3 ∀ α ∈ L: α ∈ RS ↔ (α)(inv(α)) ∈ RS

TODO

Theorem A.5.4 ∀ p ∈ Χ_L(2): Π_{i=1}^{2} p(i) ∈ RS ↔ Π_{i=1}^{1} p(i) = inv(Π_{i=2}^{2} p(i))

TODO

Theorem A.5.5 P ⊆ RS

TODO