Bugzilla 5.1.2+ confuses Markdown code blocks in multiple comments

Ivan Krylov ikrylov at disroot.org
Mon Feb 19 10:57:55 UTC 2024


Hello everyone,

The R project is currently running Bugzilla 5.1.2+, which seems to
correspond to one of the following three commits (as visible on
https://github.com/bugzilla/bugzilla):

450ce30760f7027a443872fa7d44616c421dee54
425158e8c96c9c4ebf43bd2628864b392c78ca76
f82dde7c0f3ffbe29a435d5d10dcdeeb3c024857

Sometimes, Bugzilla's Markdown renderer shows a comment with code
blocks that belong to a different comment, or even with bare U+F111
characters in place of the code blocks. Here is a prominent example:
the XML dump at
https://bugs.r-project.org/show_bug.cgi?ctype=xml&id=16158 shows
different code block contents from what can be seen at
https://bugs.r-project.org/show_bug.cgi?id=16158.

I think this boils down to the Bugzilla::Markdown object caching code
blocks in $self->_code_blocks and $self->_indented_code_blocks, while
the rest of the rendering process reuses the same cached Markdown
object from Bugzilla->markdown for every comment. I can reproduce the
same output manually if I call $markdown->markdown() on one object
with different comments:

$ wget -O 16158.xml \
 "https://bugs.r-project.org/show_bug.cgi?ctype=xml&id=16158"
$ perl \
 -I. -MBugzilla -MBugzilla::Attachment -MBugzilla::Markdown \
 -MData::Dumper -MCarp::Always -MXML::Twig -E'
  my @text = XML::Twig::->new()->parsefile("16158.xml")
   ->get_xpath("*/long_desc/thetext");
  my $markdown = Bugzilla::Markdown::->new();
  for my $t (@text) {
   say $markdown->markdown($t->text);
  }
' | less

(The output is wrong in the same manner as on the R Bugzilla website.)
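
The suspected mechanism can be shown in isolation with a toy renderer.
This is only a sketch under my own assumptions, not the Bugzilla code:
ToyRenderer, its "blocks" field and its "reset" option are hypothetical
names made up for the illustration. It stashes fenced blocks in a
per-object array, leaves a U+F111 placeholder behind, and restores the
blocks counting from the start of the array, so a reused object hands
the first comment's block to the second comment.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical toy renderer, NOT Bugzilla::Markdown itself.
package ToyRenderer;

sub new {
    my ($class, %opt) = @_;
    return bless { blocks => [], reset => $opt{reset} }, $class;
}

sub render {
    my ($self, $text) = @_;
    # Optionally forget blocks stashed by earlier calls (the idea
    # behind the proposed fix).
    @{ $self->{blocks} } = () if $self->{reset};
    # Pass 1: stash each fenced block, leave a placeholder behind.
    $text =~ s{^`{3}\n(.*?)^`{3}\n}{
        push @{ $self->{blocks} }, $1;
        "\x{F111}"
    }gmse;
    # Pass 2: restore blocks, counting from the start of the stash.
    my $n = 0;
    $text =~ s{\x{F111}}{ $self->{blocks}[$n++] }ge;
    return $text;
}

package main;

my $fence    = '`' x 3;
my @comments = (
    "first:\n$fence\nprint(1)\n$fence\n",
    "second:\n$fence\nprint(2)\n$fence\n",
);

# Render every comment with the same renderer object.
sub render_all {
    my ($renderer) = @_;
    return map { $renderer->render($_) } @comments;
}

my @shared = render_all(ToyRenderer->new);
my @fixed  = render_all(ToyRenderer->new(reset => 1));
say "without reset: $shared[1]";
say "with reset:    $fixed[1]";
```

Without the reset, the second comment comes back with the block from
the first one; with the reset, each comment keeps its own block.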

One way to fix the problem is to flush the code block cache every time
a new Markdown chunk is processed:

From ff056c330b81a2dbc691f35121b824fc1047b248 Mon Sep 17 00:00:00 2001
From: Ivan Krylov <ikrylov at disroot.org>
Date: Mon, 19 Feb 2024 13:43:07 +0300
Subject: [PATCH] Markdown: clear cached code blocks to allow reuse

Previously, using the same Bugzilla->markdown to render multiple
comments could substitute wrong code blocks due to them being retrieved
from the cache.
---
 Bugzilla/Markdown.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Bugzilla/Markdown.pm b/Bugzilla/Markdown.pm
index a0c8c671b..eb7894422 100644
--- a/Bugzilla/Markdown.pm
+++ b/Bugzilla/Markdown.pm
@@ -128,6 +128,10 @@ sub _RunSpanGamut {
 # processing markdown structures.
 sub _removeFencedCodeBlocks {
   my ($self, $text) = @_;
+  # First clear the cached code blocks to avoid problems when the
+  # Markdown object is reused.
+  @{$self->_code_blocks} = ();
+  @{$self->_indented_code_blocks} = ();
   $text =~ s{
         ^ `{3,} [\s\t]* \n
         (                # $1 = the entire code block
-- 
2.39.2

With the patch applied, the output of the one-liner above matches my
expectations. Please let me know if there is anything else I could do
to help fix this problem.

-- 
Best regards,
Ivan

