logoalt Hacker News

eesmithtoday at 9:10 AM1 replyview on HN

I believe the ChatGPT code has a bug, in that it accepts three spaces or tabs before a code fence, while the Google Markdown spec says up to three spaces, and does not allow a tab there.

I also see that the tests generated by ChatGPT are far too few for the code features implemented. The cannot be the result of actual red/green TDD where the test comes before the feature is added.

For examples, 1) the code allows "~~~" but only tests for "```", 2) there are no tests when len(fence) < fence_len nor when len(fence) > fence_len, and 3) there are no tests for leading spaces.

There's also duplicate code. The function _strip_closing_hashes is used once, in the line:

  text = _strip_closing_hashes(m.group("text")).strip()
The function is:

  def _strip_closing_hashes(s: str) -> str:
      s = s.rstrip()
      # remove trailing " ###" style closers
      s = re.sub(r"[ \t]+#+\s*$", "", s).rstrip()
      return s
The ".rstrip()" is unneeded as the ".strip()" does both lstrip and rstrip.

I think that rstrip() should be replaced with a strip(), the function renamed to "_get_inline_content", and used as "text = _get_inline_content(m.group("text")).

Also, the Google spec also says "A sequence of # characters with anything but spaces following it is not a closing sequence, but counts as part of the contents of the heading:" so is it really correct to use "\s*" in that regex, instead of "[ ]*"? And does it matter, since the input was rstrip'ped already?

So perhaps:

  def _get_inline_content(s: str) -> str:
      s = s.rstrip(" ") # remove trailing spaces
      s = s.rstrip("#") # removing "#" style closers
      return s.strip() # remove leading and trailing whitespace
would be more correct, readable, and maintainable?

Replies

simonwtoday at 11:44 AM

100%. That's why if you want good code you need to pay attention to what it's writing and testing and throw feedback like that at it.

show 1 reply