Face it, C# is not as regex-friendly as it could be. Even finding a single regex match takes several lines or some nasty inline code. Extension methods can simplify this. With another addition to StatenUtil, this one focused on regex, C# becomes a more coder-friendly language.
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace StatenUtil.RegexExtensions
{
    public static class StringRegex
    {
        public static string RegexReplace(this string source, string regex, string replacement)
        {
            Regex r = new Regex(regex);
            return r.Replace(source, replacement);
        }

        public static string Match(this string source, string regex)
        {
            Regex r = new Regex(regex);
            Match m = r.Match(source);
            return m.Value;
        }

        public static string[] Matches(this string source, string regex)
        {
            Regex r = new Regex(regex);
            MatchCollection mc = r.Matches(source);
            List<string> matches = new List<string>();
            foreach (object o in mc)
                matches.Add(((Match)o).Value);
            return matches.ToArray();
        }

        public static string RegexRemove(this string source, string regex)
        {
            Regex r = new Regex(regex);
            return r.Replace(source, "");
        }
    }
}
This class adds four extension methods to the string class: RegexReplace (replace), RegexRemove (remove), Match (first match), and Matches (all matches). Aside from RegexReplace, which also takes a replacement string, each method takes a single regular expression string.
Another addition that C# could use is a string scanner. A string scanner moves through a string one regex at a time, matching only at the beginning of what remains. This is useful for parsing languages, much like the JSON parsing done in Ruby.
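using System;
using System.Text.RegularExpressions;

namespace StatenUtil.RegexExtensions
{
    public class StringScanner
    {
        public string Remaining { get; set; }
        public string Match { get; set; }

        public StringScanner(string input)
        {
            Remaining = input;
        }

        public string Scan(string regex)
        {
            // Anchor the pattern to the start of the remaining input;
            // the group keeps alternations such as a|b anchored as well
            var m = Regex.Match(Remaining, "^(?:" + regex + ")");
            // Match is null when nothing matched, so the check below is meaningful
            Match = m.Success ? m.Value : null;
            if (Match != null)
                Remaining = Remaining.Substring(Match.Length);
            return Match;
        }
    }
}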
The StringScanner class constructor takes an input string that it will scan. Having only a Scan method, the class has the single use of moving through the input string as regex matches are made. As Scan finds a match, it returns the matching string. Also, the class has two properties: Remaining and Match. Remaining provides what is left in the string, useful for checking if the end of the input string has been reached. Match provides the most recent match of the Scan method.
Yesterday, I spent a while working on a RubyQuiz Challenge involving the parsing of JSON. Although I did create it with minimal viewing of the solution, I cannot take credit for the code or the idea. I did notice that on the site there was no actual merging of their snippets, so I felt inclined to share.
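For comparison, here is a minimal sketch of the same scanning idea using the StringScanner that ships with Ruby in the strscan library. The tokenize helper and its token names are made up for this illustration; only scan, matched, rest, and eos? come from the library.

```ruby
require "strscan"

# Hypothetical helper: split a tiny arithmetic expression into tokens
# by repeatedly matching at the front of the remaining input.
def tokenize(input)
  scanner = StringScanner.new(input)
  tokens = []
  until scanner.eos?                      # eos? is true once nothing remains
    if scanner.scan(/\s+/)
      next                                # skip whitespace
    elsif scanner.scan(/\d+/)
      tokens << [:number, scanner.matched]
    elsif scanner.scan(/[-+*\/]/)
      tokens << [:operator, scanner.matched]
    else
      raise "Unexpected input: #{scanner.rest}"
    end
  end
  tokens
end

p tokenize("12 + 7*3")
```

Each call to scan either consumes the matched text and moves forward or returns nil and leaves the position alone, which is the behavior the C# Scan method imitates.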
require "strscan"

class JSONParser
  AST = Struct.new(:value)

  def parse(input)
    @input = StringScanner.new(input)
    parse_value.value
  ensure
    @input.eos? or error("Unexpected data")
  end

  private

  def parse_value
    trim_space
    parse_object or parse_array or parse_string or parse_number or parse_keyword or error("Invalid Data")
  ensure
    trim_space
  end

  # Parses colon separated object hashes
  def parse_object
    if @input.scan(/\s*\{/)
      obj = Hash.new
      more_parts = false
      while key = parse_string
        @input.scan(/\s*:\s*/) or error("Expected : separator")
        obj[key.value] = parse_value.value
        more_parts = @input.scan(/\s*,\s*/) or break
      end
      error("Missing object pair") if more_parts
      @input.scan(/\s*}/) or error("Unclosed object")
      AST.new(obj)
    else
      false
    end
  end

  def parse_array
    if @input.scan(/\s*\[/) # Arrays start with [
      a = Array.new
      more_items = false
      while current = parse_value
        a << current.value
        more_items = @input.scan(/\s*,\s*/) or break
      end
      error("Missing value") if more_items
      # Optional whitespace, then the closing bracket
      @input.scan(/\s*\]/) or error("Unclosed array")
      AST.new(a)
    end
  end

  # Parses string objects
  def parse_string
    if @input.scan(/"/)
      s = String.new
      while current = parse_string_content || parse_string_escape
        s << current.value
      end
      @input.scan(/"/) or error('Unclosed String')
      AST.new(s)
    else
      false
    end
  end

  def parse_string_content
    @input.scan(/[^\\"]+/) and AST.new(@input.matched)
  end

  def parse_string_escape
    if @input.scan(%r{\\["\\/]}) # slashes and quotations
      AST.new(@input.matched[-1])
    elsif @input.scan(/\\[bfnrt]/) # newlines, tabs, etc
      AST.new(eval(%Q{"#{@input.matched}"}))
    elsif @input.scan(/\\u[0-9a-fA-F]{4}/) # Hex integers
      AST.new([Integer("0x#{@input.matched[2..-1]}")].pack("U"))
    else
      false
    end
  end

  def parse_number
    @input.scan(/-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?\b/) and AST.new(eval(@input.matched))
  end

  def parse_keyword
    @input.scan(/\b(?:true|false|null)\b/) and AST.new(eval(@input.matched.sub("null","nil")))
  end

  def trim_space
    @input.scan(/\s+/)
  end

  def error(message)
    raise "#{message}: #{@input.peek(@input.string.length)}"
  end
end
For an in-depth explanation of the individual portions of this code, refer to the RubyQuiz site. One way that I’ve tested this code is by accessing the Twitter search API. The following snippet will print the 20 most recent #neumont tweets:
require 'jsonParser'
require 'cgi'
require 'open-uri'

url = "http://search.twitter.com/search.json?q=%23neumont"
parser = JSONParser.new
open(url) {|html| @output = html.read}
obj = parser.parse(@output)
obj["results"].each {|e| puts CGI.unescapeHTML(e["text"])}
For anyone who’s new to the blog, StatenUtil is a personal C# utility library. Consisting mainly of extension methods, its main objective is to simplify frequently used operations. This edition adds some Ruby-like looping methods to C#. I understand that Func and Action objects aren’t a real replacement for blocks. However, it’s nice to have some features I miss when not using Ruby.
using System;

namespace StatenUtil
{
    public static class Int32Extensions
    {
        public static void Times(this int source, Action<int> action)
        {
            for (int i = 0; i < source; i++)
            {
                action(i);
            }
        }

        public static void UpTo(this int source, int max, Action<int> action)
        {
            for (int i = source; i <= max; i++)
            {
                action(i);
            }
        }

        public static void DownTo(this int source, int min, Action<int> action)
        {
            for (int i = source; i >= min; i--)
            {
                action(i);
            }
        }

        public static int[] Range(this int source, int max)
        {
            int difference = max - source;
            if (difference < 0)
                return new int[0];
            int[] ret = new int[difference + 1];
            for (int i = 0; i <= difference; i++)
                ret[i] = source + i;
            return ret;
        }
    }
}
This class file revolves around adding functions to the int class. The Times method executes a lambda expression source times, passing it the current index on each pass. For example, 6.Times(x => Console.WriteLine(x)) would print the numbers zero through 5, one per line.
UpTo and DownTo are similar to the Times method. However, they loop over a range of integers rather than from zero. Compare these to Ruby’s upto and downto methods.
Range is simply a way to create an array covering a range of integers. The source is the smallest value in the array and max is the largest; if max is less than source, the array is empty.
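For reference, the Ruby built-ins that these C# methods imitate can be sketched as follows. The loop_demo wrapper and its hash keys are made up for this illustration; times, upto, downto, and ranges are part of core Ruby.

```ruby
# Hypothetical wrapper that collects what each built-in loop yields
def loop_demo
  results = { :times => [], :upto => [], :downto => [] }
  3.times { |i| results[:times] << i }        # starts at zero, stops before 3
  2.upto(5) { |i| results[:upto] << i }       # includes both ends
  5.downto(2) { |i| results[:downto] << i }   # counts down, includes both ends
  results[:range] = (2..5).to_a               # array covering an inclusive range
  results[:empty] = (5..2).to_a               # empty when the start is above the end
  results
end

p loop_demo
```

As with the C# Times method, the count itself is never yielded, while upto, downto, and the range include both endpoints.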
using System;
using System.Collections.Generic;

namespace StatenUtil
{
    public static class IEnumerableExtensions
    {
        public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
        {
            // foreach also disposes the enumerator when the loop ends
            foreach (T item in source)
                action(item);
        }
    }
}
Currently, this class file adds inline ForEach capabilities to any generic IEnumerable object, similar to Ruby’s each. Although List has this capability, other IEnumerables like arrays do not.
That’s all for now. However, StatenUtil is a work in progress that’s always being amended. In each significant addition, the details will be shared once again.
Recently, I received a comment on one of my previous posts, Creating a Socket Server and Client in Ruby, that asked about the next step: turning the simple example into a working web server. Well, you’ve been heard and I have written up code to do just that. It handles receiving and sending headers, handling GET requests, and responding with a status.
class RequestInfo
  attr_accessor :headers, :method, :resource

  def initialize
    @headers = {}
  end
end

class ResponseInfo
  attr_accessor :status, :headers, :body

  def initialize
    @headers = {}
    @body = String.new
  end
end
These classes are short and are designed simply to contain the information coming in and going out of the web server. RequestInfo’s method holds the method of the request (GET, etc.). Resource is the file requested by the connecting client. Finally, headers is a map of key-value pairs corresponding to the request headers.
ResponseInfo consists of status, which is the first line returned containing 200 OK or 404 Not Found; body, the content of what’s actually responded with; and headers, the map of headers of the response.
request = RequestInfo.new
response = ResponseInfo.new
while((current = s.readline).chomp! != "")
  if current[0,3] == "GET"
    methodline = current.split
    request.method = methodline[0]
    request.resource = methodline[1]
  elsif current.match ':'
    headerline = current.split ':',2
    request.headers[headerline[0]] = headerline[1]
  end
end
This segment handles populating the RequestInfo object by going through the sent request line by line from the TCPSocket connection s. It retrieves the method, resource, and headers in a single while loop. The headers are retrieved by splitting on the first colon in each line that contains one.
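To try the parsing logic without opening a socket, the same idea can be sketched against a StringIO. This is a variation under a few assumptions rather than the server’s own code: parse_request is a hypothetical helper, it returns a plain hash instead of a RequestInfo, it treats the first line as the request line for any method, and it strips the whitespace after the colon.

```ruby
require "stringio"

# Hypothetical helper: io is anything that responds to gets (a socket or a StringIO)
def parse_request(io)
  request = { :method => nil, :resource => nil, :headers => {} }
  request_line = io.gets
  return request if request_line.nil?          # client sent nothing
  # First line looks like: GET /index.html HTTP/1.0
  request[:method], request[:resource] = request_line.split
  while (line = io.gets)
    line = line.chomp                          # removes \n or \r\n
    break if line.empty?                       # blank line ends the headers
    name, value = line.split(":", 2)           # split on the first colon only
    request[:headers][name] = value.strip if value
  end
  request
end

raw = "GET /index.html HTTP/1.0\r\nHost: localhost:7881\r\nAccept: text/html\r\n\r\n"
p parse_request(StringIO.new(raw))
```

Splitting with a limit of 2 matters for headers such as Host, whose value can itself contain a colon.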
begin
  File.open("public/#{request.resource}","r") do |file|
    while(curline = file.gets)
      response.body << curline
    end
  end
  response.status = "HTTP/1.0 200 OK"
  response.headers["Content-Type"] = "text/html"
  response.headers["Content-Length"] = response.body.length.to_s
rescue => err
  puts err
  response.status = "HTTP/1.0 404 Not Found"
  response.headers["Content-Type"] = "text/html"
  response.headers["Content-Length"] = "0"
  response.body = ""
end
Within this segment, we read in a file (looking in a folder called public so the webserver.rb itself cannot be retrieved through a request), appending its contents to the response.body. Afterwards, if all is successful, we respond with a 200 OK along with some other headers like Content-Type and Content-Length. Typically, if there’s an issue it’s because the file cannot be found, so returning a 404 is appropriate.
Note: A suggestion for improving this example is reading the MIME type of the read file so the Content-Type is not hardcoded.
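One simple way to act on that suggestion is to look the type up by file extension. The sketch below assumes a small hand-written table; CONTENT_TYPES and content_type_for are hypothetical names, and the list of types is only a starting point.

```ruby
# Hypothetical lookup table; extend it with whatever the server should serve
CONTENT_TYPES = {
  ".html" => "text/html",
  ".css"  => "text/css",
  ".js"   => "application/javascript",
  ".png"  => "image/png",
  ".jpg"  => "image/jpeg",
  ".txt"  => "text/plain"
}

# Returns the Content-Type for a path, falling back to a generic binary type
def content_type_for(path)
  CONTENT_TYPES.fetch(File.extname(path).downcase, "application/octet-stream")
end

puts content_type_for("index.html")
puts content_type_for("logo.PNG")
puts content_type_for("archive.zip")
```

In the server, the hardcoded "text/html" for the 200 response could then become content_type_for(request.resource).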
s.write response.status + "\n"
response.headers.each {|key,value| s.write "#{key}:#{value}\n"}
s.write "\n" + response.body
s.close
This writes to the socket connection s, in order, ResponseInfo’s status, headers, and body. Finally, the connection is closed.
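The same ordering can be sketched as a function that builds the whole response as one string, which makes it easy to inspect. Two things here are deliberate departures from the code above: build_response is a hypothetical helper, and it ends lines with "\r\n", which is what the HTTP specification calls for (many clients also tolerate a bare "\n").

```ruby
# Hypothetical helper: status line, then headers, then a blank line, then the body
def build_response(status, headers, body)
  head = [status] + headers.map { |key, value| "#{key}: #{value}" }
  head.join("\r\n") + "\r\n\r\n" + body
end

body = "<h1>Hello</h1>"
headers = {
  "Content-Type" => "text/html",
  "Content-Length" => body.bytesize.to_s   # length in bytes, not characters
}
print build_response("HTTP/1.0 200 OK", headers, body)
```

With a helper like this, the socket code reduces to a single s.write followed by s.close.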
class RequestInfo
  attr_accessor :headers, :method, :resource

  def initialize
    @headers = {}
  end
end

class ResponseInfo
  attr_accessor :status, :headers, :body

  def initialize
    @headers = {}
    @body = String.new
  end
end

require "socket"

serv = TCPServer.new('jstaten.com',7881)
loop do
  Thread.start(serv.accept) do |s|
    request = RequestInfo.new
    response = ResponseInfo.new

    # Read the request line and headers until the blank line
    while((current = s.readline).chomp! != "")
      if current[0,3] == "GET"
        methodline = current.split
        request.method = methodline[0]
        request.resource = methodline[1]
      elsif current.match ':'
        headerline = current.split ':',2
        request.headers[headerline[0]] = headerline[1]
      end
    end

    # Serve index.html for the root resource
    if (request.resource == "/")
      request.resource = "index.html"
    end

    # Build the response from the requested file, or a 404 on failure
    begin
      File.open("public/#{request.resource}","r") do |file|
        while(curline = file.gets)
          response.body << curline
        end
      end
      response.status = "HTTP/1.0 200 OK"
      response.headers["Content-Type"] = "text/html"
      response.headers["Content-Length"] = response.body.length.to_s
    rescue => err
      puts err
      response.status = "HTTP/1.0 404 Not Found"
      response.headers["Content-Type"] = "text/html"
      response.headers["Content-Length"] = "0"
      response.body = ""
    end

    # Send status, headers, and body, then hang up
    s.write response.status + "\n"
    response.headers.each {|key,value| s.write "#{key}:#{value}\n"}
    s.write "\n" + response.body
    s.close
  end
end
When the steps are all implemented, you have a working Ruby web server. Surely there are improvements to be made, but it’s an effective proof of concept that can be built upon.
I appreciate the feedback from all visitors to my blog, and if you feel that you have any questions or suggestions, feel free to comment away.
Note: Made a few corrections suggested. Thanks!
Remember the Java JAX-WS Web Service discussed what seems like long ago? Currently there is an instance of it running on this hosting server (got to love SSH). Experimenting with Ruby has shown me how simple it is to create a web service client. Thanks to the language’s duck typing, you don’t need to generate a bunch of classes to pull in a WSDL (although it is possible using wsdl2ruby).
require 'soap/wsdlDriver'

URL = 'http://jstaten.com:7070/MathExample/MathService?wsdl'

begin
  driver = SOAP::WSDLDriverFactory.new(URL).create_rpc_driver
  p driver.addInts({:arg0 => 2, :arg1 => 4}).return
rescue => err
  p err.message
end
Stepping through this code, you’re simply requiring the soap/wsdlDriver library and setting the URL of the WSDL to be read in. The begin statement is for error handling in case something goes wrong. Inside that, you create a new RPC driver using the WSDLDriverFactory. The “driver” within this code is the instance which contains methods pointing to the web service’s exposed operations. After creating the driver, we call a method on it. Because it requires two parameters (the two integers to be added), we pass it a map. The return field from running that method contains the value of the response. Finally, we rescue by printing the error message. If you’d like to test this against the web service in the code, feel free. Just please don’t abuse it.
Working through endless “Wrestles” in CS290, I’ve come to find some common needs for many of my assignments. Because of this, I’ve started to create my own utility DLL (dynamic link library) of functions. I’ll be honest that not all of the ideas were my own, especially Jamie King’s P() method.
For those who have not heard of or used extension methods, here’s a brief overview. Extension methods are simply static methods with some syntactic sugar. To create one, create a static class and within it create a static method with the this keyword before its first parameter.
public static class MyStaticClass
{
    public static void MyExtensionMethod(this Object source)
    {
    }
}
After creating this, you can then use it by calling it on any instance that inherits from Object (anything) as instanceName.MyExtensionMethod(). It’s the same as calling MyStaticClass.MyExtensionMethod(instanceName), just shorter to type. The instance is passed implicitly as the first parameter, making it feel like a method of the object rather than a static extension.
Assertion is an effective way of debugging your code. When “wrestling” with new concepts, it’s a great way to prove things without spewing out a bunch of Console.WriteLine() calls. These extension methods give you simple Assert methods to call on any object.
using System;
using System.Diagnostics;

namespace StatenUtil
{
    public static class Asserter
    {
        public static void AssertReferenceEquals(this Object source, Object target)
        {
            // ReferenceEquals is defined on Object, so call it there
            Debug.Assert(Object.ReferenceEquals(source, target), "Source and target references not equal");
        }

        public static void AssertEquals(this Object source, Object target)
        {
            if (source == null)
            {
                Debug.Assert(target == null, "Source is null, target is not");
            }
            else
            {
                Debug.Assert(source.Equals(target), "Objects are not equal");
            }
        }

        public static void AssertNull(this Object source)
        {
            Debug.Assert(source == null, "Object is not null");
        }
    }
}
Another frequent debugging technique is to print out to the console. Adding the P() extension method saves you from writing Console.WriteLine() every time. It also iterates over enumerable objects, eliminating the need to foreach over their members.
using System;
using System.Collections;

namespace StatenUtil
{
    public static class PrintExtensions
    {
        public static void P(this Object source)
        {
            P(source, "{0}");
        }

        public static void P(this Object source, string formatString)
        {
            if (!(source is String) && source is IEnumerable)
            {
                Console.WriteLine(formatString, source);
                foreach (object o in (IEnumerable)source)
                    o.P(" " + formatString);
                return;
            }
            Console.WriteLine(formatString, source);
        }
    }
}
In the future, I’ll continue to add additional features to this utility set and share them as the changes become significant. Check back often, as StatenUtil has plenty of life ahead of it. If you have comments or suggestions, please let me know.
Back with more Ruby, I bring you some database connectivity because that’s always an important thing to know about a language.
This tutorial requires that you have the mysql and dbi gems installed. If you have RubyGems installed, it should be as simple as:
#gem install mysql
#gem install dbi
However, if you’re having difficulty, surely Google understands.
Also, this requires a MySQL server running on your local machine with a database containing a customer table as follows:
CREATE TABLE `customer` (
  `id` int(11) NOT NULL auto_increment,
  `name` varchar(30) NOT NULL,
  `telephone` char(10) default NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `telephone` (`telephone`)
) ENGINE=MyISAM AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
require 'rubygems'
gem 'dbi'
require 'dbi'

begin
  con = DBI.connect("DBI:Mysql:DATABASE_NAME:localhost","USER","PASSWORD")
  result = con.execute "SELECT * FROM customer"
  result.fetch_hash do |row|
    printf "ID:%d NAME:%s PHONE:%s\n", row["id"], row["name"], row["telephone"]
  end
  result.finish
rescue DBI::DatabaseError => e
  puts "Error, #{e.err}, #{e.errstr}"
ensure
  con.disconnect if con
end
Breaking this down into some blocks, the program starts by requiring rubygems and dbi (gem ‘dbi’ allows the dbi libraries to be accessed). DBI is Ruby’s common interface to several different databases. In this case we’ll be using MySQL.
The real work begins when we use DBI.connect to create a connection to a database. This connection is what we’ll use to send queries to the database. Be sure to replace DATABASE_NAME, USER, and PASSWORD with the appropriate information for your own database.
Next up, con.execute will return a result set that can be iterated over using fetch_hash. In this case we’ll simply print each result to the console. Notice that rows are hashes that allow you to access information based on column name. Finally, result.finish means we’re done using the result set given to us.
The rescue section simply makes sure to notify of an error if something goes wrong. Also, ensure does just what it says, ensures that the connection is closed if it exists.
Surely the output right now is rather boring because you don’t have any data in your table. Let’s put some in, and let’s do it with a prepared statement. That way when you build a Ruby app with user input, you’re less susceptible to SQL injection.
cmd = con.prepare "INSERT INTO customer(name,telephone) VALUES(?,?)"
cmd.execute "Bobby","2229998888"
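To see why the placeholders matter, here is a small sketch of the risky alternative: building the statement by string interpolation. It needs no database because it only prints the SQL text; unsafe_insert_sql is a hypothetical helper written for this illustration, not something to use in an app.

```ruby
# Hypothetical helper showing the pattern that prepared statements avoid
def unsafe_insert_sql(name, telephone)
  "INSERT INTO customer(name,telephone) VALUES('#{name}','#{telephone}')"
end

# Ordinary input produces the statement you expect
puts unsafe_insert_sql("Bobby", "2229998888")

# A crafted name closes the quoted value early and appends its own SQL
puts unsafe_insert_sql("x','0000000000'); DROP TABLE customer; --", "1")
```

Whether the appended statement would actually run depends on the driver and its settings, but the input has clearly changed the shape of the query. With ? placeholders, the values are handed to the driver separately from the SQL text, so they are treated as data rather than as part of the statement.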
Remember working with result.fetch_hash? What about if you want the entire dataset? What about just the first row? You’re in luck, Ruby makes it easy.
# select_all and select_one are called on the connection itself
# returns an array of DBI::Row objects
con.select_all "SELECT * FROM customer"
# returns the first row
con.select_one "SELECT * FROM customer"
Database access with Ruby is an important skill to know if you’re dealing with storing any sort of information. With this brief overview, grasping the idea should be straightforward.
Over break, I’ve been experimenting with Ruby a little bit. It’s rather different from the C#/Java syntaxes I’ve learned through schooling. However, I’ve enjoyed using it. And for the reference of myself and others, I’m slapping up a how-to for creating a simple socket server and client.
require "socket"

serv = TCPServer.new('localhost',7885)
count = 0
loop do
  Thread.start(serv.accept) do |s|
    count += 1
    s.write "You are visitor #{count} to my TCP Ruby Server"
    s.close
    puts "New visitor: #{count}"
  end
end
The server begins by getting the socket library that already comes with Ruby. It then creates an instance of TCPServer, binding it to localhost on port 7885. Count is used to keep track of the number of incoming connections the server receives.
On each pass through the loop, the blocking method serv.accept waits until a client connects, and then a new thread is started for that connection. During its life, the thread increments the count, sends a message to the visitor, closes the connection, and finally writes the visitor count to the console.
require "socket"

client = TCPSocket.new("localhost",7885)
str = ""
while (add = client.recv(100)) != ""
  str += add
end
puts str
client.close
The client simply receives all output sent to it. It opens a connection with localhost on port 7885. After connecting, it receives up to 100 bytes at a time until no more are read. Each buffer is appended to the final output string, and the connection is closed.
And there you have it, a simple Ruby socket connection. From here you can create such things as a web server or your own protocol. Enjoy!
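To watch both halves talk to each other in a single script, here is a minimal sketch that runs the server side in a thread. It differs from the listings above in a few labeled ways: round_trip is a hypothetical helper, the server accepts only one connection, it binds to port 0 so the operating system picks a free port, and the client uses read (which reads until the server closes the connection) instead of a recv loop.

```ruby
require "socket"

# Hypothetical helper: start a one-shot server, connect to it, return what was received
def round_trip
  server = TCPServer.new("127.0.0.1", 0)   # port 0 asks the OS for any free port
  port = server.addr[1]                    # the port that was actually assigned

  server_thread = Thread.new do
    connection = server.accept             # blocks until the client connects
    connection.write "You are visitor 1 to my TCP Ruby Server"
    connection.close
  end

  client = TCPSocket.new("127.0.0.1", port)
  message = client.read                    # reads until the server closes its end
  client.close

  server_thread.join
  server.close
  message
end

puts round_trip
```

Because the server closes the connection after writing, the client sees end of input right after the greeting, just as the recv loop eventually gets an empty string.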