JavaScript: remove comments safely and correctly

Goal: remove the code comments correctly

One step of my JavaScript minification process is to remove the comments from the code. As we know, there are two comment types in JS:

  // single line comment
  /* multiline comment */

Here is a simple C# snippet that drops the comments:

  string ret = Regex.Replace(code, @"(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|(//.*)", string.Empty, RegexOptions.Multiline);

This code looks correct, but it fails in some special cases, such as the following:

  var t = 'this is a simple comment: //here is';
  /* the result will be incorrect! */
  var t = 'this is a simple comment: 
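
To make the failure easy to reproduce, here is a minimal sketch that ports the naive pattern above to a JavaScript regular expression. The names `NAIVE_COMMENT` and `stripCommentsNaive` are made up for this illustration, and the port assumes the pattern behaves the same way in JavaScript as it does in .NET for this input.

```javascript
// Naive pattern: a block comment OR everything from "//" to the end of the line.
// It knows nothing about string literals.
const NAIVE_COMMENT = /(\/\*([^*]|[\r\n]|(\*+([^*\/]|[\r\n])))*\*+\/)|(\/\/.*)/g;

// Hypothetical helper: replace every match with the empty string.
function stripCommentsNaive(code) {
  return code.replace(NAIVE_COMMENT, "");
}

const input = "var t = 'this is a simple comment: //here is';";

// The "//" inside the string literal is treated as a comment start,
// so the rest of the line, including the closing quote, is removed.
console.log(JSON.stringify(stripCommentsNaive(input)));
// prints "var t = 'this is a simple comment: "
```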

So the string literal contains a dangerous fake comment, and I want to keep it, of course. My safer code is:

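  // Group 1 captures string literals, and "$1" puts them back unchanged.
  // Comments match outside the group, so they are replaced with nothing.
  var re = @"(@(?:""[^""]*"")+|""(?:[^""\n\\]+|\\.)*""|'(?:[^'\n\\]+|\\.)*')|//.*|/\*(?s:.*?)\*/";
  string ret = Regex.Replace(code, re, "$1");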

This pattern matches string literals first and puts them back unchanged, so anything that looks like a comment inside a quoted string, such as "./* fake comment */.." or '. // fake comment ..', is kept. That is the expected behavior.
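
The same idea can be sketched in JavaScript itself. This is an adaptation, not the exact pattern above: the `@"..."` alternative is left out, `(?s:.*?)` is written as `[\s\S]*?`, and the inner `+` quantifiers are dropped so each loop iteration consumes one character or one escape pair. The names `STRING_OR_COMMENT` and `stripComments` are made up for this illustration.

```javascript
// Alternative 1 (captured): a double- or single-quoted string literal.
// Alternative 2: a line comment. Alternative 3: a block comment.
// Because the string alternative is tried first at each position,
// comment-like text inside a string is consumed as part of the string.
const STRING_OR_COMMENT =
  /("(?:[^"\n\\]|\\.)*"|'(?:[^'\n\\]|\\.)*')|\/\/.*|\/\*[\s\S]*?\*\//g;

// Hypothetical helper: keep captured strings, drop everything else that matched.
function stripComments(code) {
  return code.replace(STRING_OR_COMMENT, (match, str) =>
    str !== undefined ? str : ""
  );
}

const sample =
  "var t = 'this is a simple comment: //here is'; // real comment";

console.log(JSON.stringify(stripComments(sample)));
// prints "var t = 'this is a simple comment: //here is'; "
```

A sketch like this still only knows about quoted strings. Regular expression literals and template literals that contain `//` or `/*` are not recognized, so a tokenizer or parser is likely the safer choice for arbitrary input.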
