I believe the ChatGPT code has a bug, in that it accepts three spaces or tabs before a code fence, while the Google Markdown spec says up to three spaces, and does not allow a tab there.
I also see that the tests generated by ChatGPT are far too few for the code features implemented. The cannot be the result of actual red/green TDD where the test comes before the feature is added.
For examples, 1) the code allows "~~~" but only tests for "```", 2) there are no tests when len(fence) < fence_len nor when len(fence) > fence_len, and 3) there are no tests for leading spaces.
There's also duplicate code. The function _strip_closing_hashes is used once, in the line:
text = _strip_closing_hashes(m.group("text")).strip()
The function is: def _strip_closing_hashes(s: str) -> str:
s = s.rstrip()
# remove trailing " ###" style closers
s = re.sub(r"[ \t]+#+\s*$", "", s).rstrip()
return s
The ".rstrip()" is unneeded as the ".strip()" does both lstrip and rstrip.I think that rstrip() should be replaced with a strip(), the function renamed to "_get_inline_content", and used as "text = _get_inline_content(m.group("text")).
Also, the Google spec also says "A sequence of # characters with anything but spaces following it is not a closing sequence, but counts as part of the contents of the heading:" so is it really correct to use "\s*" in that regex, instead of "[ ]*"? And does it matter, since the input was rstrip'ped already?
So perhaps:
def _get_inline_content(s: str) -> str:
s = s.rstrip(" ") # remove trailing spaces
s = s.rstrip("#") # removing "#" style closers
return s.strip() # remove leading and trailing whitespace
would be more correct, readable, and maintainable?
100%. That's why if you want good code you need to pay attention to what it's writing and testing and throw feedback like that at it.