OU blog

Richard Walker

Elementary my dear WAtSON

As a wordplay enthusiast I've often mused over what words can be formed by combining the symbols of the chemical elements.

For example, Actinium-Actinium-Iodine-Arsenic would give "acacias" and Barium-Oxygen-Barium-Boron-Sulphur would give "baobabs".

A while ago I tried to come up with some examples by hand but only found a few, so I decided to write a little program in Python (see below[*]). For simplicity I just used lowercase letters, e.g. "h", "he", "li" etc., but putting back the capitals would not be too hard. I also excluded the artificial transuranic elements (26 found to date), but again adding them would be easy enough. My reason for omitting them is that their names and symbols are a bit less familiar.
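As a rough illustration of how the capitals could be put back, here is a minimal sketch. The helper name with_capitals is just something I've made up for this example; it takes a list of lowercase symbols and joins them with each first letter capitalised.

```python
def with_capitals(symbols):
    """Join lowercase symbols, capitalising the first letter of each one."""
    return "".join(symbol.capitalize() for symbol in symbols)


# The symbols for "baobabs": Barium, Oxygen, Barium, Boron, Sulphur
print(with_capitals(["ba", "o", "ba", "b", "s"]))  # prints BaOBaBS
```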

Also note that it uses a "greedy" algorithm: if the first two letters are a valid symbol it chooses that first, so it can't handle "those", because it finds "th" (Thorium), then "os" (Osmium), and now the "e" is orphaned, even though Thorium-Oxygen-Selenium would spell it. To overcome this I'd need to add some backtracking capability, so the program would get a bit more complex.
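For the curious, here is one way the backtracking might look. This is only a sketch, not the program I actually ran: the names SYMBOLS and parse_backtrack are invented for the example, and it returns None rather than 'fail' when no spelling exists. It still tries the two-letter symbol first, but if the rest of the word can't be completed it goes back and tries the one-letter symbol instead.

```python
from functools import lru_cache

# The same 92 lowercase symbols (hydrogen to uranium), held in a set
# so that membership tests are fast.
SYMBOLS = frozenset("""
    h he li be b c n o f ne na mg al si p s cl ar k ca
    sc ti v cr mn fe co ni cu zn ga ge as se br kr rb sr y zr
    nb mo tc ru rh pd ag cd in sn sb te i xe cs ba la ce pr nd
    pm sm eu gd tb dy ho er tm yb lu hf ta w re os ir pt au hg
    tl pb bi po at rn fr ra ac th pa u
""".split())


@lru_cache(maxsize=None)
def parse_backtrack(word):
    """Return a tuple of symbols spelling word, or None if there is none."""
    if not word:
        return ()
    # Try the two-letter symbol first, then fall back to the one-letter one.
    for size in (2, 1):
        head = word[:size]
        if len(head) == size and head in SYMBOLS:
            rest = parse_backtrack(word[size:])
            if rest is not None:
                return (head,) + rest
    # Neither choice led to a complete spelling.
    return None


print(parse_backtrack("those"))  # prints ('th', 'o', 'se')
print(parse_backtrack("cream"))  # prints None
```

The lru_cache decorator just remembers the answer for each word ending, so a long word with many dead ends doesn't repeat the same work.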

I analysed an open-source word list of 113,810 words and found 13,435 - let's call them "elementary" - words, about 12%. We'd expect the chances of a word being elementary to fall off as words get longer, because only 92 one- or two-letter combinations of letters are symbols of chemical elements but the total number of possible one- or two-letter combinations is 702 (26 single letters plus 26 x 26 pairs), so only about 1 in 8 correspond to elements.
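The arithmetic behind those figures can be checked in a few lines. This is just a sanity check using the numbers quoted above; the variable names are made up for the example.

```python
letters = 26
combinations = letters + letters ** 2  # single letters plus ordered pairs
symbols = 92

print(combinations)                            # prints 702
print(round(combinations / symbols, 1))        # prints 7.6, i.e. about 1 in 8
print(round(100 * 13435 / 113810, 1))          # prints 11.8, i.e. about 12%
```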

Nonetheless I found some surprisingly long examples, with the champion being the 18-letter "counteraccusations". Here are some other long ones.

acacias
acarpous
accepters
accessions
accountancy
accurateness
accuratenesses
articulatenesses
counteraccusation

"counteraccusations" represents Cobalt Uranium Nitrogen Tellurium Radium Carbon Copper (Cu) Sulphur Astatine Iodine Oxygen Nitrogen Sulphur; 11 distinct elements, which I imagine is a record.

What about whole sentences? It's not easy to make up natural-sounding ones, but here is my attempt at one from a scientific setting. I've added proper punctuation to make it more realistic.

"Pop both new boxes in lab one Gabby, ta."

[*] Program follows: list of symbols, then function to test words, then sample function calls.

elements = [
    "h", "he", "li", "be", "b", "c", "n", "o", "f", "ne",
    "na", "mg", "al", "si", "p", "s", "cl", "ar", "k", "ca",
    "sc", "ti", "v", "cr", "mn", "fe", "co", "ni", "cu", "zn",
    "ga", "ge", "as", "se", "br", "kr", "rb", "sr", "y", "zr",
    "nb", "mo", "tc", "ru", "rh", "pd", "ag", "cd", "in", "sn",
    "sb", "te", "i", "xe", "cs", "ba", "la", "ce", "pr", "nd",
    "pm", "sm", "eu", "gd", "tb", "dy", "ho", "er", "tm", "yb",
    "lu", "hf", "ta", "w", "re", "os", "ir", "pt", "au", "hg",
    "tl", "pb", "bi", "po", "at", "rn", "fr", "ra", "ac", "th",
    "pa", "u"
]

def parse(word):
    """Greedily split word into element symbols; return 'fail' if stuck."""

    result = []

    while len(word) > 0:

        token1 = word[0:1]
        token2 = word[0:2]

        # Look for two-letter symbol first
        if token2 in elements:
            result.append(token2)
            word = word[2:]

        # If not, check for a one-letter symbol
        elif token1 in elements:
            result.append(token1)
            word = word[1:]

        else:
            return 'fail'
        
    return result

print(parse('cream'))  # prints fail ("cr" matches, but "e" is not a symbol)
print(parse('scone'))  # prints ['sc', 'o', 'ne']