Need Help with C# Reviews, for MS-SQL Support!

Bill Barry after.fallout at gmail.com
Wed Apr 7 17:49:33 UTC 2010


I'd be happy to help out.
Should Bugzilla style conventions be used for this project or standard C# ones?

Ignoring any of the style issues there are actual code issues I am seeing. 
I don't see any actual security issues here, but there could very well be race conditions (which could result in bad/missing data). 
There will be people who have issues with using managed functions in general, but I don't see a way around it particularly for the concat aggregate and the regex search (everything else I think is doable in raw sql functions; though I suppose a regex engine is possible there too, I wouldn't want to try writing it).


Anyway, assuming standard (mostly because I don't feel comfortable enough with the Bugzilla ones since not having done any bz or perl coding in quite a while now; I am pretty sure other things will need to change if it is to be Bugzilla):

Bugzilla.MSSQL.pidb
Bugzilla.MSSQL.suo
Bugzilla.MSSQL.userprefs

these files should be ignored, not added (IDE/compiler generated)

into actual changes:
AssemblyInfo.cs
===============
+using System.Runtime.CompilerServices;

this line is unused in the current file (hence should not be there)

+[assembly: AssemblyTitle("Bugzilla.MSSQL")]

Nit: Standard C# naming practices would have this be Bugzilla.Mssql (or perhaps MSSql, but eww...; mssql is usually referred to in .NET as SqlServer though, don't know if that should be given any weight or not)
see: http://msdn.microsoft.com/en-us/library/ms229043.aspx
quote: Do capitalize only the first character of acronyms
with three or more characters, except the first word of a camel-cased
identifier

overall assemblyinfo.cs nits:
1. Every bit of information in this file should be filled in
2. The following addititional code should be in assemblyinfo.cs as standard practice:
using System.Runtime.InteropServices;

[assembly: AssemblyFileVersion("1.0.*")]

// Setting ComVisible to false makes the types in this assembly not visible
// to COM components.  If you need to access a type in this assembly from
// COM, set the ComVisible attribute to true on that type.
[assembly: ComVisible(false)]

// The following GUID is for the ID of the typelib if this project is exposed to COM (create a new GUID for this)
[assembly: Guid("f06345e0-a4d4-47d8-99fc-5af3232233cf")]

(reasoning: it is far better to be explicit up front than to add it in later on)

Bugzilla.MSSQL.cs
=================
Filename should be Bugzilla.Mssql.cs

+using System.Data;
+using System.Data.SqlClient;

unused, shouldn't be there

+using System.Text;

unused, shouldn't be there

+using NaturalComparer;

unused, shouldn't be there

+public partial class UserDefinedFunctions

This class is not partial, it is fully defined. This class should be in a Bugzilla.Mssql namespace. This class should be marked static.

+    [Microsoft.SqlServer.Server.SqlFunction()]

Namespace is unnecessary (included in using statements), used multiple times in the file, will not repeat myself over and over again

+    public static SqlBoolean REGEXP([SqlFacet(MaxSize = -1)]SqlString subject, [SqlFacet(MaxSize = -1)]SqlString pattern, SqlBoolean ignorecase)

Function name is not in line with naming conventions for C# function names, Should be Regexp
nit: In .NET regexes are of the type Regex, not Regexp. Should the function name match?

+        string str_subject = subject.Value;
+        string str_pattern = pattern.Value;

names are nonstandard, should be strSubject and strPattern, though you should avoid any form of Hungarian notation. Better names would be subjectValue and patternValue.

+            test = System.Text.RegularExpressions.Regex.IsMatch(str_subject, str_pattern,RegexOptions.IgnoreCase);

namespace is unnecessary (included in using statements)

+            test = System.Text.RegularExpressions.Regex.IsMatch(str_subject, str_pattern);

namespace is unnecessary (included in using statements)

+        if (test)
+        {
+            return SqlBoolean.True;
+        }
+        else
+        {
+            return SqlBoolean.False;
+        }

nit: stylistically I do not like this code; better would be to pull out the Regexp function to:

private static bool RegexpImpl(SqlString subject, SqlString pattern, SqlBoolean ignorecase)

then replace the implementation of Regexp with:

return ToSqlBoolean(RegexpImpl(subject, pattern, ignorecase));

and have a function ToSqlBoolean:

private static SqlBoolean ToSqlBoolean(bool test)
{
    return test ? SqlBoolean.True : SqlBoolean.False;
}

This would allow you to also replace the contents of NOT_REGEX with

return ToSqlBoolean(!RegexpImpl(subject, pattern, ignorecase));

cutting the file shorter by about 15 lines and making the methods slightly more optimized.

+    public static SqlBoolean NOT_REGEXP([SqlFacet(MaxSize = -1)]SqlString subject, [SqlFacet(MaxSize = -1)]SqlString pattern, SqlBoolean ignorecase)

Function name should be NotRegexp

+    public static SqlInt32 INSTR([SqlFacet(MaxSize = -1)]SqlString find, [SqlFacet(MaxSize = -1)]SqlString inthis, SqlBoolean CaseSensitive)

Function name should be InStr
parameter inthis should be inThis
parameter CaseSensitive should be caseSensitive

nit: Why does this one have caseSensitive and the regex functions have ignorecase? The parameters should be consistent with each other as the current method could be confusing and eventually lead to issues.

+        if (ind == -1)
+        {
+            return 0;
+        }
+        else {
+            return ind+1;
+        }

condition is unnecessary, should be:

+        return ind+1;

empty line at the end of this method should be removed

+    [Microsoft.SqlServer.Server.SqlFunction()]
+    public static SqlDateTime NOW()
+    {
+
+        return DateTime.Now;
+    }
+
+    [Microsoft.SqlServer.Server.SqlFunction()]
+    public static SqlDateTime LOCALTIMESTAMP()
+    {
+        return NOW();
+    }

Aren't both of these doing the same thing that getdate() does?
Again nonstandard names (will stop saying this now for future methods)

nit: empty line at beginning of NOW() (empty lines at the beginning and end of methods are quite pointless and IMO can only possibly serve to be an attempt to get the reader to disassociate the method name from its implementation or to make syntax errors when modifying and I decree that this is a bad thing in all cases)

+    public static SqlInt32 TO_DAYS(SqlDateTime date)
+    {
+        DateTime SubjectDate = date.Value;
+        DateTime Y1900 = new DateTime(1900, 1, 1);
+
+        long elapsedTicks = SubjectDate.Ticks - Y1900.Ticks;
+        TimeSpan elapsedSpan = new TimeSpan(elapsedTicks);
+        return Convert.ToInt32(elapsedSpan.TotalDays);
+
+    }

1. Shouldn't you be checking if date is null?
2. Why are you subtracting ticks like that? Can't you just do
+        return Convert.ToInt32((SubjectDate - Y1900).TotalDays);
3. variable names (should be camelCase), will not repeat again

+        Y1900.AddDays(days.Value);

shouldn't be here, statement has no side affect and is repeated anyway on the next line

+        if (SqlDate.IsNull) return SqlChars.Null;
+        string str_format = format.Value;
+        DateTime Date = SqlDate.Value;
+
+        string a = Date.DayOfWeek.ToString();

if statement uses different syntax than before; should be consistent
variable "a" is unused

As for the if statements: most of the time spent inside string.Replace is spent doing the same thing that string.Contains does. Both take far longer than evaluating whatever the replacement would be.
I would replace that whole section (from the first if to the return statement) with:
        return new SqlChars(str_format
            .Replace(@"%a", Date.DayOfWeek.ToString().Substring(0, 3))
            .Replace(@"%b", ((ShortMonthOfYear) Date.Month).ToString())
            .Replace(@"%c", Date.Month.ToString())
            .Replace(@"%D", Date.Day.ToString() + GetDateSuffix(Date))
            .Replace(@"%d", Date.Day.ToString().PadLeft(2, '0'))
            .Replace(@"%e", Date.Day.ToString())
            .Replace(@"%f", Date.Millisecond.ToString().PadRight(6, '0'))
            .Replace(@"%H", Date.Hour.ToString().PadLeft(2, '0'))
            .Replace(@"%h", Date.Hour.ToString().PadLeft(2, '0'))
            .Replace(@"%I", Date.Hour.ToString().PadLeft(2, '0'))
            .Replace(@"%i", Date.Minute.ToString().PadLeft(2, '0'))
            .Replace(@"%j", Date.DayOfYear.ToString().PadLeft(3, '0'))
            .Replace(@"%k", Date.Hour.ToString().PadLeft(2, '0'))
            .Replace(@"%l", Date.Hour.ToString())
            .Replace(@"%M", ((LongMonthOfYear) Date.Month).ToString())
            .Replace(@"%m", Date.Month.ToString().PadLeft(2, '0'))
            .Replace(@"%p", Date.Hour >= 12 ? @"PM" : @"AM")
            .Replace(@"%r", Date.TimeOfDay.ToString())
            .Replace(@"%S", Date.Second.ToString().PadLeft(2, '0'))
            .Replace(@"%s", Date.Second.ToString().PadLeft(2, '0'))
            .Replace(@"%T", Date.ToShortTimeString().ToString())
            .Replace(@"%U", WeekOfYear(Date).ToString())
            .Replace(@"%u", WeekOfYear(Date).ToString())
            .Replace(@"%V", WeekOfYear(Date).ToString())
            .Replace(@"%v", WeekOfYear(Date).ToString())
            .Replace(@"%W", Date.DayOfWeek.ToString())
            .Replace(@"%w", ((LongDayofWeek) Date.DayOfWeek).ToString())
            .Replace(@"%X", Date.Year.ToString())
            .Replace(@"%x", Date.Year.ToString())
            .Replace(@"%Y", Date.Year.ToString())
            .Replace(@"%y", Date.Year.ToString().Substring(2, 2))
            .Replace(@"%%", @"%"));

That said, I think there are some bugs here:
I doubt that %H, %h, %I and %k are supposed to return the same value (similarly U,u,V,v, and X,x,Y).
Is %f correct or should it also be PadLeft?
You have an if for capital S and then do a replace with lowercase s.
Could a test be written to ensure the correctness of this function?
Are there l10n issues here?

+    [Serializable]
+    private enum LongDayofWeek
...
+
+    [Serializable]
+    private enum ShortMonthOfYear
...
+
+    [Serializable]
+    private enum LongMonthOfYear

These should be public to prevent a future compiler from obfuciating them.

+    private static string GetDateSuffix(SqlDateTime SqlDate)
+    {
+        if (SqlDate.IsNull) return null;
+
+        DateTime date = SqlDate.Value;
+
+        string day = date.Day.ToString();
+        if (day.EndsWith("1"))
+        {
+            return day.StartsWith("1") && date.Day != 1 ? "th" : "st";
+        }
+        else if (day.EndsWith("2"))
+        {
+            return day.StartsWith("1") ? "th" : "nd";
+        }
+        else if (day.EndsWith("3"))
+        {
+            return day.StartsWith("1") ? "th" : "rd";
+        }
+        return "th";
+    }

parameter being passed in is DateTime, not SqlDateTime (makes first if and assignment unnecessary)
first if syntax inconsistent with second if syntax
else keywords are unneeded

+    private static int WeekOfYear(SqlDateTime SqlDate)
+    {
+        if (SqlDate.IsNull) return 0;
+
+        DateTime date = SqlDate.Value;
+
+        System.Globalization.CultureInfo ci = System.Threading.Thread.CurrentThread.CurrentCulture;
+        System.Globalization.Calendar cal = ci.Calendar;
+        System.Globalization.CalendarWeekRule cwr = ci.DateTimeFormat.CalendarWeekRule;
+        DayOfWeek fdow = ci.DateTimeFormat.FirstDayOfWeek;
+        return cal.GetWeekOfYear(date, cwr, fdow);
+    }

parameter being passed in is DateTime, not SqlDateTime (makes first if and assignment unnecessary)
if syntax inconsistant again
using CurrentThread.CurrentCulture here, is this ok in sql app domain context (I think so but I am not certain)?
You use the current culture here but previously you used InvariantCulture for case sensitivity. Does that present an internationalization issue (how does mysql do case sensitivity particularly for accented characters? does it vary per culture and if so should this or should it do it however mssql does case sensitivity when it is configured as insensitive, if they are different?)?

+public struct GROUP_CONCAT : Microsoft.SqlServer.Server.IBinarySerialize

should be in separate file, should be in Bugzilla.Mssql namespace, should be named GroupConcat
This class feels smelly:
1. A List<T> datatype is almost always better than an ArrayList (and in worst case is equivalent)
2. this keyword is not necessary for accessing your own members unless they share a name with something else you are trying to use
3. Read is not a compliment to Write
4. Lists are not thread safe. I wouldn't doubt there may be race conditions in this code.

I would write this class as follows (completely untested code, but it looks less buggy I think):
public struct GROUP_CONCAT : Microsoft.SqlServer.Server.IBinarySerialize {
    private static readonly object lockobj = new object();
    private List<string> _remembered;
    private List<GROUP_CONCAT> _merged;
    private string _seperator;
    private string _sortOrder;
    public void Init() {
        _remembered = new List<string>();
        _merged = new List<GROUP_CONCAT>();
        _seperator = null;
        _sortOrder = null;
    }
    public void Accumulate([SqlFacet(MaxSize = -1)]SqlString value, [SqlFacet(MaxSize = -1, IsNullable = true)]SqlString seperator, SqlString sortOrder) {
        // Default a Null to Seperator to an empty string
        _seperator = seperator.IsNull ? "" : seperator.Value;

        // Add the Seperator as the first array element
        _sortOrder = sortOrder.IsNull ? "NTRL" : sortOrder.Value.ToUpper();

        if (_sortOrder != "ASC" && _sortOrder != "DESC" && _sortOrder != "NTRL") {
            throw new ArgumentException(sortOrder.Value + " is not valid for Sort Order");
        }

        if (value.IsNull) return;

        lock (lockobj) {
            _remembered.Add(value.Value);
        }
    }
    public void Merge(GROUP_CONCAT group) {
        if(group._seperator!=_seperator) {
            throw new ArgumentException("Seperators do not match: " + group._seperator+", "+_seperator);
        }
        if(group._sortOrder!=_sortOrder) {
            throw new ArgumentException("Sort Orders do not match: " + group._sortOrder + ", " + _sortOrder);
        }

        //not sure about the thread safety of Add/AddRange so...
        lock (lockobj) {
            _merged.Add(group);
        }
    }

    private IEnumerable<string> Enumerate() {
        lock (lockobj) {
            foreach (var value in _remembered) {
                yield return value;
            }
            foreach (var list in _merged) {
                foreach (var value in list.Enumerate()) {
                    yield return value;
                }
            }
        }
    }

    public SqlString Terminate() {
        List<string> all = new List<string>(Enumerate());

        // If ASC or DESC then Sort
        //new NaturalComparer(NaturalComparerOptions.RomanNumbers));
        if (_sortOrder != "NTRL") all.Sort(new NaturalComparer.NaturalComparer());
        // If DESC then reverse the Sort order
        if (_sortOrder == "DESC") all.Reverse();
        return new SqlString(string.Join(_seperator, all.ToArray()));
    }

    public void Read(System.IO.BinaryReader r) {
        int itemCount = r.ReadInt32();

        // Retrieve the Seperator for use by Terminate()
        _seperator = r.ReadString();
        _sortOrder = r.ReadString();
        lock (lockobj) {
            _remembered = new List<string>(itemCount - 2);
        }
        for (int i = 2; i <= itemCount - 1; i++) {
            lock (lockobj) {
                _remembered.Add(r.ReadString());
            }
        }
    }

    public void Write(System.IO.BinaryWriter w) {
        List<string> all = new List<string>(Enumerate());
        w.Write(all.Count);
        w.Write(_seperator);
        w.Write(_sortOrder);
        foreach (string s in all) {
            w.Write(s);
        }
    }
}

Bugzilla.MSSQL.csproj
=====================
+    <RootNamespace>Bugzilla.MSSQL</RootNamespace>
+    <AssemblyName>Bugzilla.MSSQL</AssemblyName>
+    <TargetFrameworkVersion>v3.5</TargetFrameworkVersion>

Names again...
Should this be targeting .net 3.5? or should it target 2.0 (for a potentially wider audience)? It doesn't appear to be using any 3.5 assemblies or syntax (unless you use the GROUP_CONCAT class I provided above, in which case you would have to remove the 3.5 var syntax).

NaturalSortComparer.cs
======================
Assuming this code is correct (not even going to bother going there at least yet) my most significant issue is that the variable names are not very explicit. In my company we use _camelCase for fields, camelCase for method variables and parameters and PascalCase for just about everything visible outside of the class.
These classes should also be in a Bugzilla.Mssql namespace and they should be in separate files per class (at least according to C# conventions that I know of, perhaps the bugzilla devs do not care to follow this).

I am finding myself questioning what the whole point of this class is.
What advantages does it provide that Bugzilla might need that will permit it to waste so much time (parsing strings) as part of the sql execution?




Max Kanat-Alexander wrote:
> 	Hey everybody! So, I've started to review the contributed patches for
> MS-SQL support, and it's coming along, but there's one problem! Some of
> the patches are in C#, and although I can read it, I don't have enough
> experience writing C# for MS-SQL to feel that I can confidently review
> the contribution. So, I'd really like somebody from the community (or
> perhaps many people) who have some C# experience to take a look at the
> patch on this bug:
>
> 	https://bugzilla.mozilla.org/show_bug.cgi?id=487437
>
> 	At the end is a bunch of C# and Visual Studio project files, and I'd
> really like somebody to take a look at it for code clarity,
> implementation, and security. Even if the existing patch there already
> has "review-" on it, don't fear--that's just me reviewing the Perl code;
> I'm not touching the C# stuff.
>
> 	If you'd really like to see MS-SQL support (or if you just happen to
> have some C# experience and would like to contribute to Bugzilla), this
> is the way to help! :-)
>
> 	-Max
>   




More information about the developers mailing list