A Microsoft KB Article–Example of How not to write code
Couple of weeks back I was doing some parsing of xml string and want to replace the special characters in the data. I didn’t remember the list of special characters, like others I fire up a search in Google. One of the google search result leads to a Microsoft KB article. I got what I wanted, list of special character I have to deal with. The article also provided a special character replacement logic, to my surprise the code was very lengthy and looks odd to me.
I decided to scan the code in little more detail. Below is the code from the article.
Dim filepath As String = "C:\customers.xml"
Private Sub ReplaceSpecialChars(ByVal linenumber As Long)
Dim strm As StreamReader
Dim strline As String
Dim strreplace As String
Dim tempfile As String = "C:\temp.xml"
Try
FileCopy(filepath, tempfile)
Catch ex As Exception
MessageBox.Show(ex.Message)
End Try
Dim strmwriter As New StreamWriter(filepath)
strmwriter.AutoFlush = True
strm = New StreamReader(tempfile)
Dim i As Long = 0
While i < linenumber - 1
strline = strm.ReadLine
strmwriter.WriteLine(strline)
i = i + 1
End While
strline = strm.ReadLine
Dim lineposition As Int32
lineposition = InStr(strline, "&")
If lineposition > 0 Then
strreplace = "&"
Else
lineposition = InStr(2, strline, "<")
If lineposition > 0 Then
strreplace = "<"
End If
End If
strline = Mid(strline, 1, lineposition - 1) + strreplace + Mid(strline, lineposition + 1)
strmwriter.WriteLine(strline)
strline = strm.ReadToEnd
strmwriter.WriteLine(strline)
strm.Close()
strm = Nothing
strmwriter.Flush()
strmwriter.Close()
strmwriter = Nothing
End Sub
Public Function LoadXMLDoc() As XmlDocument
Dim xdoc As XmlDocument
Dim lnum As Long
Dim pos As Long
Dim Newxml As String
Try
xdoc = New XmlDocument()
xdoc.Load(filepath)
Catch ex As XmlException
MessageBox.Show(ex.Message)
lnum = ex.LineNumber
ReplaceSpecialChars(lnum)
xdoc = LoadXMLDoc()
End Try
Return (xdoc)
End Function
The user calls the LoadXMLDoc() function, the function loads the xml file located in the disk. The weird thing is, the function does a recursive call in the catch block until all the special characters are replaced. Have a look at the line in red.
Let’s go to the ReplaceSpecialChars() function. It creates a temp copy of the original xml file and does the character replacement with several number of IO operation. Think a scenario where we need to deal with an xml data that has some thousand line and each line has special character, we end up doing IO operation thousands of time. Wow what a logic.
Refactored Approach
I rewrote the logic, I used C# as my choice. My approach is simple, read the xml file then format it the way we need it and load it to XMLDocument. Let see my piece of code.
public XmlDocument LoadXMLDoc() { XmlDocument doc = new XmlDocument(); string xmlToLoad = this.ParseXMLFile(); doc.LoadXml(xmlToLoad); return doc; } private string ParseXMLFile() { StringBuilder formatedXML = new StringBuilder(); StreamReader xmlReader = new StreamReader("CustomerData.xml"); while (xmlReader.Peek() >= 0) formatedXML.Append(this.ReplaceSpecialChars(xmlReader.ReadLine())); return formatedXML.ToString(); } private string ReplaceSpecialChars(string xmlData) { int grtrPosAt = xmlData.IndexOf(">"); int closePosAt = xmlData.IndexOf("</"); int lenthToReplace = 0; if (grtrPosAt>closePosAt) return xmlData; lenthToReplace= (closePosAt<=0 && grtrPosAt<=0)?xmlData.Length:(closePosAt-grtrPosAt)-1; //get the string between xml element. e.g. <ContactName>Hanna Moos</ContactName>,
//you will get 'Hanna Moos' string data = xmlData.Substring(grtrPosAt + 1, lenthToReplace); string formattedData = data.Replace("&", "&").Replace("<", "<")
.Replace(">", ">").Replace("'", "'"); xmlData = xmlData.Replace(data, formattedData); return xmlData; }
I took reactive approach rather than waiting the XMLDocument to throw error and rectify it then try loading. ReplaceSpecialChars() function formats the xml string and return it. I spend just 10 minute to come up with this function, so may not be the best one but definitely better than what mentioned in KB article.
Point to note In my approach
- The IO operation is only once, first time to read the xml file.
- XMLDocument will get a clean formatted xml to do it’s operation.
- No recursive call in catch block.
- LOC is less, so easy to understand.
If you find any better approach, update that in the comment section.
Note: I send a feedback to Microsoft couple of weeks back about the issues in the code but no response yet.
Happy coding …..

Leave a comment