July 30, 2010

Chris' adventures at Product Fair 2010

As a Staffing Manager in the Windows/Windows Live Division, I have the opportunity to see broadly across what’s going on with the Windows Division. Additionally, I sometimes get a glimpse into how our efforts connect to other Microsoft business units around the world, which is great, especially if you’re the kind of person who took apart a Rubik’s Cube just to see how it works. So, I jump at any chance to get an even bigger or different perspective of Microsoft as a whole.

Yesterday, I walked over to the 2010 Microsoft Product Fair, which was held at the Soccer Field on the Redmond Campus. Yep, there’s a soccer field right in the middle of campus. The Seattle Sounders have come out a few times and practiced here; last year Nate Jaqua signed a poster for my son and mentioned that he really liked the field – I guess That’ll Do. Anyway, back to the Fair:

It’s a bunch of tents, big tents, with folks from different businesses and teams showing off the newest developments in their products or services, many I recognized and a few that I only knew on a superficial level.

I can’t comment on a lot of the things I saw, mostly because I’m not allowed to, but partly because after the person spent two or three minutes explaining their technology or feature, I realized I’d need I.Q. Steroids and a Comp-Sci degree to really get it. On an entirely related note, there’s just something very cool about talking with someone who truly is “top of the list” when it comes to talent in their field – even if I don’t fully understand it. Perhaps, because I don’t fully understand it.  

I also participated in an Office 2010 contest where I raced a clock to complete different scenarios in an Office 2010 product, in my case, Excel. The scoring thing wasn’t able to get me a final score – I’m consoling myself that I over “Excel-ed” it.

Overall, I was able to see a lot of great stuff there, some headed to production, some in R&D that may never become a “product” but rather a feature of an existing or upcoming product, and some like new Xbox features or accessories. 

The highlight of the day for me = Kinect

Fellow JobsBlogger/colleague Kenji and I played a couple games of “Kinect Adventures,” which quickly highlighted how out of shape I’ve become, and that I should never consider starting a River Rafting company – it’s already clear that my 11-year-old will own me.

By the way, did you know that in Office 2010 Excel - when you press <Ctrl> and click “Total” – a picture of Chuck Norris pops up?

- Chris

Work at Microsoft!

July 29, 2010

Making music on Windows 7 - Microsoft Product Fair 2010

Editor's Corner: Thomas

Check out this footage from Microsoft Product Fair 2010: employees learn to make music using Windows 7 (64-bit) with DJ Darek Mazzone. Afterwards, they take home their new music on a jump drive.

Darek discusses how Windows 7 "is a revolutionary operating system in the creative space." This was one of dozens of amazing products that employees were able to sample at the Product Fair this week.

July 28, 2010

Running Code Coverage from the Console with dotCover

As of the dotCover beta*, we include a Console runner for running coverage from the command line, which makes it possible, for instance, to set up dotCover in a Continuous Integration environment. Let’s see how it works.

*To get all the features described in this post, you need to download the latest nightly build.

 

Console Runner

The Console runner is located under the installation folder (%Program Files%\JetBrains\dotCover\v1.0\bin\dotCover.exe). The best option is to add it to the system path so as to be able to run it from anywhere. The runner accepts several commands based on the operation we want to perform. Each of these commands in turn takes one parameter, which is an XML configuration file. The commands are:

  • cover: Coverage of specified application
  • merge: Merges snapshots
  • report: Creates an XML report of the coverage results
  • list: Lists snapshots
  • delete: Deletes snapshots
  • analyse: Provides an all-in-one analysis and output

We can also obtain a list of commands by typing dotCover.exe help on the command line:

image

As shown in the figure above, we can find out more about each command by typing help followed by the command. All commands (including help) have corresponding shortcuts. For instance, to get information about analyse, we can type:

dotCover.exe help analyse

or

dotCover.exe h a

Since every command takes an XML file as its parameter, requesting help for a command simply generates a sample XML file with comments indicating what each element means. The previous command would therefore output:

image

The obvious advantage of this is that we can easily set up a new configuration file by just asking for help and piping the output to a file. In fact, if we add a file name as a third parameter, dotCover will do this for us, without our having to touch up the output (remove headers, etc.).

Coverage in one command

The dotCover console runner is flexible enough to meet the most common requirements of a Continuous Integration setup. Normally in projects, we have multiple test projects we need to cover. We might then want to see the results separately or merge them into one report. dotCover allows for this kind of flexibility. However, in many cases, all we want to do is run coverage on a single test project and see the results. We’ll see how to do that in a single step in this blog post, and in a future post we’ll get into the details of merging, listing, etc.

Let’s see how to do this step by step. I’ll be using a sample app that I’ve prepared for this blog post (using MSpec). You can adjust the parameters to your own project paths and unit testing framework as required.

 

  1. Type dotcover help analyse coverage.xml on the command line. I’m doing this in the root folder of my solution.

image

  2. Add coverage.xml to the Solution and open it up in Visual Studio.
  3. Eliminate all elements except the following:

     image

  4. Fill in the elements with the corresponding values:

    Executable: Path to the unit testing framework runner (in this case, mspec.exe)
    Arguments: Arguments passed to the runner, here the test assemblies (specs.dll)
    WorkingDir: Optional path to the working folder (provide this to avoid having to fully qualify test assemblies)
    Output: Path of the XML report containing coverage information, relative to the coverage.xml path (in this case, output.xml)

     image

  5. That’s it in terms of configuration. Now all we need to do is run it by typing dotCover analyse coverage.xml. If all is configured correctly, we should see the following output:

    image

    and there should be an output.xml file located in the same folder. Opening it up we can see the coverage results

    image
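
In a Continuous Integration setup, the same command can be launched from a small build helper rather than typed by hand. The following is a minimal sketch, not part of dotCover: the CoverageStep class and its method names are made up for this example, it assumes dotCover.exe is on the system path, and treating a non-zero exit code as a failure is an assumption rather than documented behavior.

```csharp
using System;
using System.Diagnostics;

// Hypothetical build helper; CoverageStep is not part of dotCover.
public static class CoverageStep
{
    // Describes the process for: dotCover.exe analyse <configFile>
    public static ProcessStartInfo BuildStartInfo(string configFile, string workingDir)
    {
        return new ProcessStartInfo
        {
            FileName = "dotCover.exe", // assumes the runner is on the system path
            Arguments = "analyse \"" + configFile + "\"",
            WorkingDirectory = workingDir,
            UseShellExecute = false,
            RedirectStandardOutput = true
        };
    }

    // Runs the analysis, echoes the runner's output and returns its exit code.
    public static int Run(string configFile, string workingDir)
    {
        using (var process = Process.Start(BuildStartInfo(configFile, workingDir)))
        {
            Console.Write(process.StandardOutput.ReadToEnd());
            process.WaitForExit();
            return process.ExitCode;
        }
    }

    public static void Main(string[] args)
    {
        var config = args.Length > 0 ? args[0] : "coverage.xml";

        // Assumption: a non-zero exit code means the coverage run failed.
        Environment.Exit(Run(config, Environment.CurrentDirectory));
    }
}
```

Keeping BuildStartInfo separate from Run makes it easy to check the command line that will be used without actually launching the runner.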

    Filtering from the Console

    Looking at the results, we can see that there is a lot of noise, that is, assemblies that we’re not interested in such as the tests, MSpec assemblies, etc. In a previous blog post, we saw how to filter these out using dotCover GUI. Let’s see how we can do this via the command line. If we generate the default coverage configuration, we can see that there is a filter section:

    image

    These are used to filter assemblies. We can indicate which assemblies we want covered and which we don’t. In order to filter out all assemblies and concentrate only on our actual code (ClassLib), we can do one of two things:

    1. Filter out all non-required assemblies individually by adding them to the ExcludeFilters section, as in:

     image

    2. Exclude everything that’s not specifically in the Include section. This is done by leaving an ExcludeFilter section blank and placing what we want covered in the IncludeFilter:

    image

    Depending on our needs and on which of the two lists (Include or Exclude) is smaller, we can opt for option 1 or 2. The output should now be filtered:

    image

     

    Giving it some style

    The only thing left to do is to format the output so that it can be viewed nicely inside a browser. This can be done easily using XSLT and some appropriate design skills (something I lack).

    image 

    Included in the project you have an XSLT to create some HTML as well as one to convert the XML to JSON. Having JSON output allows us to combine it with something like jQuery to display a nice treeview or grid with the results.
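
As a rough idea of what applying such a stylesheet looks like from .NET, here is a minimal sketch using XslCompiledTransform. It does not use the XSLT from the project: both the stylesheet and the sample report are invented for illustration, and the report shape (a Root element with Assembly children carrying Name and CoveragePercent attributes) is an assumption, not the actual structure of output.xml.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class CoverageReportStyler
{
    // Hypothetical report shape, for illustration only; the real output.xml may differ.
    public const string SampleReport =
        "<Root><Assembly Name='ClassLib' CoveragePercent='75' /></Root>";

    // Illustrative stylesheet: renders one list item per assembly.
    public const string Stylesheet = @"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='html' />
  <xsl:template match='/Root'>
    <ul>
      <xsl:for-each select='Assembly'>
        <li><xsl:value-of select='@Name' />: <xsl:value-of select='@CoveragePercent' />%</li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the given XSLT to the report XML and returns the resulting HTML.
    public static string ToHtml(string reportXml, string xslt)
    {
        var transform = new XslCompiledTransform();
        using (var xsltReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(xsltReader);
        }

        using (var input = XmlReader.Create(new StringReader(reportXml)))
        using (var output = new StringWriter())
        {
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ToHtml(SampleReport, Stylesheet));
    }
}
```

For real use, the two strings would be replaced by the contents of output.xml and of the stylesheet of your choice.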

    July 26, 2010

    YouTrack for OSS Projects

    Quite some time ago, several OSS leads asked us whether JetBrains could provide an OSS license for YouTRACK. We’ve taken it one step further, and in collaboration with the great folks at CodeBetter and Devlicio.us, we are pleased to announce YouTRACK at CodeBetter. With TeamCity already running most of the OSS projects there, it only made sense to offer the same facilities for issue tracking at the same location. And seeing that YouTRACK integrates with TeamCity, we can offer all the functionality the combination is capable of.

     

    image

     

    I’d like to thank Kyle Baley, James Kovacs and Brendan Tompkins for making this possible, as without their support this wouldn’t have happened. Not only were they generous in hosting YouTRACK, but they also ran the risk of providing me with the credentials to the server, and at one point it looked like it had all gone to hell. Luckily, however, all seems to be working great now.

    What are the next steps?

    If you are an OSS project lead and would like to use YouTRACK for your issue tracking, please email teamcity@codebetter.com with your details and you’ll be up and running in no time. If you want to learn more about YouTRACK, please visit our web site.

    Events This Week – July 26th, 2010

    Here are the events listed in Community Megaphone for the next week (or so) for the Mid-Atlantic area, as well as webcasts of interest…this list includes events imported from the UGSS event calendar, and all events entered in Community Megaphone are also automatically synced to the UGSS event calendar:

    • Using the New Features in Reporting Services 2008 R2
      Tuesday, July 27, 2010 11:00 AM, Online
      SQL Server Reporting Services has been around for a number of years now and the 2008 R2 release provides DBAs and Developers with a truly enterprise class reporting platform. If your company has not embraced Reporting Services fully yet then now is the time! Come see not only some of the new features that are available but also how you can leverage the environment to provide custom solutions to fit your extensive business needs.
      [ Event Details | Add To Calendar ]

    • Visual Studio 2010 Community Launch - Roanoke Valley
      Tuesday, July 27, 2010 1:00 PM, Roanoke, VA
      Visual Studio 2010 is now available, and this version of the IDE promises to be the best ever. Come and see presentations on Visual Studio 2010 and the new .NET Framework 4.0 features by top regional speakers. If you are a .NET Developer or would like to learn more about this exciting platform, then you won't want to miss this event! All are welcome to attend, but seating is limited so reserve your spot now! And be sure to tell your friends!
      Come and learn about the exciting changes to Visual Studio and the .NET Framework while you socialize and network with fellow developers. This seminar offers three (3) presentations including:
      * Introduction to .NET Framework 4.0
      * Whirlwind Tour of Visual Studio 2010
      * What's Coming in Windows Phone 7
      [ Event Details | Map & Directions | Add To Calendar ]

    • Introduction to Windows Azure
      Wednesday, July 28, 2010 6:30 PM, Washington, DC
      Join us for David Makogon's Introduction to Windows Azure presentation.
      Agenda:
      6:30 PM - 6:45 PM Pizza! Beverages! Networking!
      6:45 PM - 7:00 PM Sponsor's time
      7:00 PM - 8:30 PM Introduction to Windows Azure - David Makogon
      8:30 PM - 8:45 PM Raffle for some great giveaways!
      Spread the word, bring a friend, hope to see you there!
      [ Event Details | Map & Directions | Add To Calendar ]

    • DC ALT.NET - ASP.NET MVC View Engines
      Wednesday, July 28, 2010 7:00 PM, Alexandria, VA
      Session:
      View Engines: Razor, Spark, & NHAML
      Overview:
      You may be asking yourself, "Self, what is a view engine and why would I want one?" A view engine is a library that takes an HTML-like template, converts that template into code, and then executes that code to get HTML for return to the browser. If you're using ASP.Net MVC, you're already using a view engine whether you know it or not: the WebForms view engine.
      In this presentation you'll learn how to install and configure alternatives to the standard WebForms view engine in ASP.Net MVC, and you'll walk away knowing how to utilize them to improve your productivity. The view engines we'll cover are:
      * Razor, the new view engine from Microsoft.
      * Spark, a popular OSS alternative.
      * NHAML, another OSS alternative straight from Ruby-land.
      About Our Speaker:
      Troy Goode is a lead developer with NRECA in Arlington, VA. Previously he worked as a consultant for the World Bank, National Academy of Engineering, the Institute of Medicine, and other agencies & associations in the DC area. He lives in Arlington with his wife and two dogs.
      [ Event Details | Map & Directions | Add To Calendar ]

    • Develop Cloud Computing Apps on force.com
      Thursday, July 29, 2010 7:00 PM, Alexandria, VA
      The Force.com development platform makes building applications faster and easier than ever. It includes a database, security, workflow, user interface, and other tools that step you through the process of building powerful business apps, mobile apps, and Web sites. During this meeting Tony Kim will show us how to build and deploy applications on Force.com.
      [ Event Details | Map & Directions | Add To Calendar ]

    • Parallelism and Performance: Are You Getting Full Return on Your CPU Investment?
      Tuesday, August 03, 2010 12:00 PM, Online
      Presenter: Adam Machanic
      Abstract:
      In today's multi-core-driven world, query performance is very much determined by how well you're taking advantage of the processing power at your disposal. Are your big queries using every available clock tick, or are they lagging behind? And if your queries are already going parallel, can they be rewritten for even greater speed? In this session you will learn the background necessary to take full advantage of parallelism. We'll cover what parallelism is, why it's important, and the basics of how to read parallel query plans. Examples will be shown to illustrate some of the huge performance gains that can be had when we learn to properly control SQL Server's parallel processing capabilities. This session is a small preview of some of the material that will be covered in Adam Machanic's full-day PASS Summit post-con, "A Day of Doing Many Things at Once."
      About Adam:
      Adam Machanic is a Boston-based independent database consultant, writer, and speaker. He has been involved in dozens of SQL Server implementations for both high-availability OLTP and large-scale data warehouse applications, and has optimized data access layer performance for several data-intensive applications. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "SQL Server 2008 Internals" (Microsoft Press, 2009) and "Expert SQL Server 2005 Development" (Apress, 2007). Adam regularly speaks at user groups, community events, and conferences on a variety of SQL Server and .NET-related topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, Microsoft Certified IT Professional (MCITP), and a member of the INETA North American Speakers Bureau.
      [ Event Details | Add To Calendar ]

    • Ft Lauderdale Online: DALLAS "Azure and Data as a Service", How cloud computing has created a new market for Interactive Data Services
      Tuesday, August 03, 2010 6:30 PM, Online
      Learn about the new service from Microsoft, code name "Dallas" that will allow your data to be exposed and for you to collect revenue!
      [ Event Details | Add To Calendar ]

    Want your events listed? You can add them here.

    You can also add your events via the Community Megaphone web service API, which is now live. You can get more information on the API, and how to sign up, at http://www.communitymegaphone.com/API.aspx. You can also email me for more information.

    July 20, 2010

    Writing plug-ins for ReSharper: Part 2 of N

    Finally I’ve managed to get the second part of the series on plug-ins out. Sorry for the delay to everyone who was waiting; I appreciate your patience. And now I’ll resume my holidays!

    In the previous part of this series, we saw the basics of how to create a plug-in for ReSharper, install it and run it. We created a context action that would allow us to mark a public method as virtual (where applicable). However, this was an explicit action by the user; as such, you didn’t get any kind of hint or suggestion to do this. What we want to do now is make this a hint, so that highlighting appears under methods that could be made virtual. In this part we are going to expand on the same plug-in and convert it into a QuickFix.

    What is a QuickFix?

    Have you seen the little squiggly lines that appear in Visual Studio?

    image

    They usually indicate a Suggestion (field can be made read-only), Warning (possible null reference) or Error. ReSharper analyzes the code and can detect potential issues (similar to what the static checker of Code Contracts does). These are known as Highlights, and they are related to QuickFixes in that a highlight usually has a QuickFix associated with it, which invokes a context action. This is usually done by placing the cursor on the highlighting and pressing Alt+Enter.

    image

     

    Highlighting Daemons

    In the gutter of the Visual Studio editor (right-side), ReSharper displays a series of warnings, errors and hints, which indicate potential issues on a specific file. These issues are detected by background processes known as Daemons. Since what we are looking for is for ReSharper to warn us of existing methods that could be made virtual, what we need to do is somehow hook into these daemons.

    image

     

    Step by Step Guide

    The Daemons in ReSharper use the Visitor pattern to act on elements, be it code, files, etc. The first step is to implement the IDaemonStage interface, which holds metadata about our daemon stage and at the same time acts as a factory for the actual process we are implementing.

    [DaemonStage(StagesBefore = new[]  { typeof(LanguageSpecificDaemonStage) })]
     public class MakeMethodVirtualDaemonStage: IDaemonStage
     {
         public IDaemonStageProcess CreateProcess(IDaemonProcess process, DaemonProcessKind processKind)
         {
             return new MakeMethodVirtualDaemonStageProcess(process);
         }

         public ErrorStripeRequest NeedsErrorStripe(IProjectFile projectFile)
         {
             return ErrorStripeRequest.STRIPE_AND_ERRORS;
         }
     }

    There are two main methods to implement: CreateProcess, which creates the actual process for us, and NeedsErrorStripe, which indicates whether this daemon uses the gutter to display stripes. The DaemonProcessKind parameter passed into the first method lets us discriminate on when this process should be executed, e.g. only while checking the visible (current) document, during solution-wide analysis, etc.

    The next step is to implement the process via the IDaemonStageProcess interface:

      public class MakeMethodVirtualDaemonStageProcess : IDaemonStageProcess
      {
          readonly IDaemonProcess _process;

          public MakeMethodVirtualDaemonStageProcess(IDaemonProcess process)
          {
              _process = process;
          }

          public void Execute(Action<DaemonStageResult> committer)
          {
              // Bail out early if the daemon has been interrupted by some external action
              if (_process.InterruptFlag)
              {
                  return;
              }

              // Get the C# file currently being analyzed
              var file = _process.ProjectFile.GetPsiFile(CSharpLanguageService.CSHARP) as ICSharpFile;

              if (file != null)
              {
                  var highlights = new List<HighlightingInfo>();

                  // The lambda is invoked for every method declaration found during the tree walk
                  var processor = new RecursiveElementProcessor<IMethodDeclaration>(declaration =>
                  {
                      var accessRights = declaration.GetAccessRights();

                      // Same check as in Part 1: public, non-static and not yet virtual or override
                      if (accessRights == AccessRights.PUBLIC && !declaration.IsStatic && !declaration.IsVirtual &&
                          !declaration.IsOverride)
                      {
                          var docRange = declaration.GetNameDocumentRange();

                          highlights.Add(new HighlightingInfo(docRange, new MakeMethodVirtualSuggestion(declaration)));
                      }
                  });

                  file.ProcessDescendants(processor);

                  // Hand the collected highlightings back to ReSharper
                  committer(new DaemonStageResult(highlights));
              }
          }
      }

    The main meat of this class is in the Execute method. We first check that we’ve not received an interruption (InterruptFlag raised) due to some external action. The next step is to get access to the current file (remember that we are visiting the entire visible document, not just a specific method). Having the file, we can now create a RecursiveElementProcessor* to walk the AST and perform a specific action on each element. The action to perform is declared as the lambda expression. Since we’re interested in method declarations, the type is IMethodDeclaration (there are many others). If we look at the expression, we can see that it’s pretty much the same as that of Part 1; the only difference is that we add the results to the highlights list.
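
To make the idea of the tree walk a bit more tangible, here is a stripped-down sketch of the same technique outside of ReSharper: a recursive walk that invokes a callback for every node of a given type. None of these types come from the ReSharper API; Node, ClassNode, MethodNode, TreeWalker and VirtualCandidateDemo are hypothetical stand-ins invented for this example, and the real processor certainly does more than this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified syntax-tree types; these are not ReSharper's types.
public abstract class Node
{
    public readonly List<Node> Children = new List<Node>();
}

public sealed class ClassNode : Node { }

public sealed class MethodNode : Node
{
    public string Name;
    public bool IsPublic, IsStatic, IsVirtual, IsOverride;
}

public static class TreeWalker
{
    // Depth-first walk that invokes the callback for every descendant of type T.
    public static void ProcessDescendants<T>(Node root, Action<T> callback) where T : Node
    {
        foreach (var child in root.Children)
        {
            var match = child as T;
            if (match != null)
            {
                callback(match);
            }

            ProcessDescendants(child, callback);
        }
    }
}

public static class VirtualCandidateDemo
{
    // Collects the names of methods that pass the same check used in the daemon stage process.
    public static List<string> FindVirtualCandidates(Node file)
    {
        var highlights = new List<string>();

        TreeWalker.ProcessDescendants<MethodNode>(file, method =>
        {
            if (method.IsPublic && !method.IsStatic && !method.IsVirtual && !method.IsOverride)
            {
                highlights.Add(method.Name);
            }
        });

        return highlights;
    }

    public static void Main()
    {
        var file = new ClassNode();
        file.Children.Add(new MethodNode { Name = "Save", IsPublic = true });
        file.Children.Add(new MethodNode { Name = "Create", IsPublic = true, IsStatic = true });
        file.Children.Add(new MethodNode { Name = "Load", IsPublic = true, IsVirtual = true });

        foreach (var name in FindVirtualCandidates(file))
        {
            Console.WriteLine(name + " could be marked as virtual");
        }
    }
}
```

Of the three sample methods, only Save passes the check, since Create is static and Load is already virtual.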

    The HighlightingInfo class has a parameter which can be a Suggestion, Warning or Error, as explained previously. Since in our case we need a suggestion, we pass in the MakeMethodVirtualSuggestion:

    [StaticSeverityHighlighting(Severity.SUGGESTION)]
     public class MakeMethodVirtualSuggestion : CSharpHighlightingBase, IHighlighting
     {
         public ICSharpTypeMemberDeclaration Declaration { get; private set; }

         public MakeMethodVirtualSuggestion(ICSharpTypeMemberDeclaration memberDeclaration)
         {
             Declaration = memberDeclaration;
         }

         public string ToolTip
         {
             get { return "Method could be marked as virtual"; }
         }

         public string ErrorStripeToolTip
         {
             get { return ToolTip; }

         }

         public override bool IsValid()
         {
             return Declaration.IsValid();
         }

         public int NavigationOffsetPatch
         {
             get { return 0; }
         }
     }

    This class is pretty simple. The main property to define is ToolTip, which is the text shown when we hover over the highlighting. ErrorStripeToolTip is what’s displayed in the right-hand side gutter. Finally, the StaticSeverityHighlighting attribute indicates what type of highlighting it is (Suggestion, Warning, Error, etc.).

     

    [*Note: In this case, the operation we want to perform is very simple. For a more complex scenario, where we need to do some processing before and after each element is visited or want more fine-grained control, we can implement IRecursiveElementProcessor. I’ll cover this in another post.]

     

    To recap, we now have everything in place to display highlighting when a method that could be made virtual is encountered. The only remaining part is to apply a QuickFix. This is in many ways similar to the ContextAction we saw in Part 1:

    [QuickFix]
    public class MakeMethodVirtualQuickFix : BulbItemImpl, IQuickFix
    {
        readonly MakeMethodVirtualSuggestion _highlighter;

        // Takes as parameter the Highlighter the quickfix refers to
        public MakeMethodVirtualQuickFix(MakeMethodVirtualSuggestion highlighter)
        {
            _highlighter = highlighter;
        }

        // In the transaction we make the necessary changes to the code
        protected override Action<ITextControl> ExecuteTransaction(ISolution solution, IProgressIndicator progress)
        {
            _highlighter.Declaration.SetVirtual(true);

            return null;
        }

        // Text that appears in the context menu
        public override string Text
        {
            get { return "Make Method Virtual"; }
        }

        // Indicates when the option is available
        public bool IsAvailable(IUserDataHolder cache)
        {
            return _highlighter.IsValid();
        }
    }

    The MakeMethodVirtualQuickFix needs to implement the IBulbItem and IQuickFix interfaces. For ease of implementation we can inherit from BulbItemImpl. The constructor should always take as its parameter the highlighting that gave rise to the QuickFix, in our case the MakeMethodVirtualSuggestion. Similar to the ContextAction we implemented in Part 1, the actual fix itself is pretty trivial. All we need to do is make the method virtual. How do we get access to the method? The easiest way is via the Declaration property of the highlighting passed in (this is a property we added before). The only thing left is to call the SetVirtual method on it. Since we are in the ExecuteTransaction method, ReSharper makes sure that any change made is executed as a whole.

    The rest of the properties are trivial. Text returns the text of the QuickFix (what appears in the menu), and IsAvailable indicates when the QuickFix is available, which in our case is whenever the highlighting is valid.

     

    The End Result

    Once we compile the plug-in and place it in the corresponding Plugins folder under ReSharper\Bin, we’re done. Here’s the end result:

    image

    and invoking Alt+Enter on the highlighting gives us:

     

    image

     

    Summary

    Extending ReSharper to create highlightings and quick fixes is pretty simple once you understand how all the pieces fall into place. Most of the code will usually be the same and what will vary will be the actual element processing to be performed and the corresponding QuickFix. As mentioned previously (in the Note), for complex scenarios, we can have more control over the tree walk and that’s something we’ll examine in a future post.

    I’ve placed the code up on my github account, so feel free to download it, play with it and ping me if you have any comments or questions. The code is updated to work with ReSharper 5.1.

    [Thanks to Howard for his valuable input]

    July 19, 2010

    Events This Week – July 19th, 2010

    Here are the events listed in Community Megaphone for the next week (or so) for the Mid-Atlantic area, as well as webcasts of interest…this list includes events imported from the UGSS event calendar, and all events entered in Community Megaphone are also automatically synced to the UGSS event calendar.

    Note that the Blend-O-Rama webcast series runs all week, so if you missed today’s webcast, there are still 4 more that you can catch, and organizers Joel Cochran and Kevin Griffin are recording all of the sessions:

    • Blend-O-Rama
      Monday, July 19, 2010 11:30 AM, Online
      Blend-O-Rama is a series of 5 lunch and learns designed to take you from zero to hero in Expression Blend.
      By registering for the event, you'll be sent reminders for each of the sessions with a link to the LiveMeeting log in page.
      Blend-O-Rama is being presented by self-described "Blend Evangelist" Joel Cochran. The event is being sponsored by Kevin Griffin, and the resources provided by the Hampton Roads .NET Users Group.
      [ Event Details | Add To Calendar ]

    • The IT Pro's Learning Guide For SQL Server 2008 R2
      Saturday, July 24, 2010 9:30 AM, Online
      Location: This is an online meeting via Live Meeting
      If you’re an IT Professional and you know little or nothing about SQL Server 2008 R2, you are in trouble. Microsoft's fastest selling platform, SharePoint, and their newest platform, Azure, rely heavily on the SQL Server environment. We are going to show you how to learn about these environments so that you can elevate your skill set to be able to plan, support, manage and maintain these environments.
      Pre-registration is REQUIRED, please go to:
      http://www.clicktoattend.com/?id=149048 to register
      Access information will be sent to you by email
      [ Event Details | Add To Calendar ]

    • Using the New Features in Reporting Services 2008 R2
      Tuesday, July 27, 2010 11:00 AM, Online
      SQL Server Reporting Services has been around for a number of years now and the 2008 R2 release provides DBAs and Developers with a truly enterprise class reporting platform. If your company has not embraced Reporting Services fully yet then now is the time! Come see not only some of the new features that are available but also how you can leverage the environment to provide custom solutions to fit your extensive business needs.
      [ Event Details | Add To Calendar ]

    • Visual Studio 2010 Community Launch - Roanoke Valley
      Tuesday, July 27, 2010 1:00 PM, Roanoke, VA
      Visual Studio 2010 is now available, and this version of the IDE promises to be the best ever. Come and see presentations on Visual Studio 2010 and the new .NET Framework 4.0 features by top regional speakers. If you are a .NET Developer or would like to learn more about this exciting platform, then you won't want to miss this event! All are welcome to attend, but seating is limited so reserve your spot now! And be sure to tell your friends!
      Come and learn about the exciting changes to Visual Studio and the .NET Framework while you socialize and network with fellow developers. This seminar offers three (3) presentations including:
      * Introduction to .NET Framework 4.0
      * Whirlwind Tour of Visual Studio 2010
      * What's Coming in Windows Phone 7
      [ Event Details | Map & Directions | Add To Calendar ]

    • Develop Cloud Computing Apps on force.com
      Thursday, July 29, 2010 7:00 PM, Alexandria, VA
      The Force.com development platform makes building applications faster and easier than ever. It includes a database, security, workflow, user interface, and other tools that step you through the process of building powerful business apps, mobile apps, and Web sites. During this meeting Tony Kim will show us how to build and deploy applications on Force.com.
      [ Event Details | Map & Directions | Add To Calendar ]

    Want your events listed? You can add them here.

    You can also add your events via the Community Megaphone web service API, which is now live. You can get more information on the API, and how to sign up, at http://www.communitymegaphone.com/API.aspx. You can also email me for more information.

    July 13, 2010

    Screencast: Overview of dotCover

    In this short screencast you can see the basics of dotCover and how to get up and running in a matter of minutes.

     

    JetBrains E-Shop, Community Blackout Recovery Schedule

    Hello all,

    This is a post for those of you who are currently trying (and failing) to buy something from JetBrains, to submit a bug into our issue tracker, or to contribute to one of our community resources.

    Today the extreme heat in Europe finally took a toll on us in the form of a blackout that temporarily took down a part of our online assets.

    While the majority of resources at jetbrains.com are available, our e-shop section is down (which is why you aren’t currently able to order any of our products), as are our bug tracker, wiki, and forums.

    We’re regularly getting updates on the recovery progress, and the current (quite rough) recovery time estimate is this evening, 19:00 CET.

    We apologize for the inconvenience, and we’re really hoping for a quick recovery.

    Update! It seems like it’s all over, and on schedule! Power has been restored, meaning that YouTrack, Confluence, and the JetBrains e-shop are now back online, and our support service is working well!

    July 08, 2010

    Filtering with dotCover

    dotCover allows us to run coverage analysis on our code. However, there are times when we do not want to analyze certain areas: our test assemblies, certain third-party assemblies, or even specific parts of our own project. Since every analyzed assembly affects the overall statistics and can make the analysis take longer, it is often useful to filter certain assemblies or classes out.

    The figure below shows the coverage report of the default MVC project (yes, yet another MVC example, but it ships in the box and probably the ONLY template with VS that comes with tests). In this case we’re using xUnit for tests, which is mostly to demonstrate that dotCover works with other frameworks, not only MSTest (personally I’ve moved on to MSpec, which dotCover also supports).

    image

    If we focus on the Coverage results we can see that there are some areas in grey, which indicate that the corresponding PDB files were missing so coverage could not take place. We can also see other assemblies that have been included in the results yet might not be of interest, such as the test assemblies.

    image 

    In order to filter these out, we can use the Coverage Filters, which can be accessed via the dotCover menu in the IDE:

    image

    By default, everything is set to be covered, as shown by the Everything entry in the Allow Filters tab (which most likely will be renamed to Included). Examining this entry (clicking on Edit) we can see that it is composed of three values:

    image

    • Module Mask indicates the project name
    • Class Mask indicates the class name
    • Function Mask indicates the method name

    (All entries support wildcards, as indicated by the * that appears.)

    We can now use these patterns to exclude projects, assemblies, namespaces, classes and methods from coverage analysis.

    Excluding an entire project

    Click on the Deny Filters tab (most likely will be renamed to Excluded) and click on Add, entering the following information:

    image

    After clicking OK and re-running the tests, we see that the entire test assembly has been excluded.

    Excluding an entire Namespace

    We can exclude entire namespaces by specifying them in the Class Mask as shown below:

    image

    This will exclude all the classes inside the MvcApplication16.Models namespace. If we just wanted to exclude a specific class, we would add its name after Models instead of *.

    Excluding a method

    We can exclude a method, similar to how we’ve done it with the previous cases:

    image

     

    Adding a series of filters for test and auxiliary assemblies using the previous steps, we can end up with a cleaner and more accurate coverage report:

    image

    produced by adding the following filters:

    image

    Notice that by using the *.Tests mask (or, as I call it, *.Specifications), we can automatically exclude all our test projects from code coverage.

     

    Beta Disclosure

    Note 1: Currently these settings are Global, that is, they apply to all projects. Most likely we’ll be changing it so that they are Solution and Global scoped. This way, common libraries can be excluded for all solutions, and specific projects and/or namespaces can be solution based.

    Note 2: We’re also looking at adding functionality to make it easier to define these exclusions (a la Right-Click on Folder, add to Excluded list).

    As always, we’re still in beta and feedback is more than welcome!

    July 07, 2010

    dotCover 1.0 Beta Released

    We are happy to announce the release of dotCover 1.0 Beta, the latest addition to the .NET tools from JetBrains. dotCover is a Code Coverage tool providing you with information on how much of your code is covered by unit tests or execution. In this latest release, you can find the following functionality among others:

    Code Coverage based on Unit Tests integrated with ReSharper Test Runner

    By running your tests, dotCover can measure how much of your code they cover and potentially show you points of failure in your code. Unlike the coverage tools that ship with Visual Studio, dotCover supports the majority of OSS frameworks such as NUnit, xUnit, and MSpec, as well as, of course, MSTest. And best of all, you get to use ReSharper’s test runner, including sessions, keyboard shortcuts, etc.

    image

    Code Coverage based on Code Execution

    Don’t have unit tests? Ask yourself why not, and get to work! But in the meantime (during your breaks from catching up on all those unit tests) you can still take advantage of dotCover’s coverage techniques. By executing your application, you can detect what parts of your code have been called and which parts are left out in the cold, during a typical usage scenario.

    image

    Console Runner

    Want to automate code coverage in a Continuous Integration environment or call dotCover from your build scripts? You can do it easily using the Console Runner that ships with dotCover. In future versions we’ll even provide first class integration with TeamCity.

    image

    Filtering Coverage

    dotCover provides functionality to filter out code from being covered, be it test assemblies or others. With a few simple settings you can define exactly what you want to be covered.

    image

    Covering Tests

    Want to know which tests cover a certain section of code? No problem. With a simple operation, dotCover displays all the tests that cover a specific piece of code.

    image

    IDE Integration

    Tight IDE integration displays code coverage directly within the IDE, without having to switch tools or change context. Visual highlighting shows which lines of code have been covered by a running test and which have not. You can even define your own colors.

    image

    More to come…

    This is just the beginning of what we have planned for dotCover. Download the beta now and give us your feedback. We’re very eager to hear it!

    ReSharper 5.1: Bug Fixes, Performance, XAML 2009

    ReSharper 5 gets its first official update today with the launch of the new ReSharper 5.1, the bug fix and performance tuning release that additionally features support for XAML 2009.

    As expected, integrating with the new Visual Studio turned out to be one of the greatest challenges for ReSharper developers. It took us 2.5 months to collect feedback from 5.0 RTM and 5.1 EAP users all around the world, reproduce integration problems and other kinds of issues, fix them, and finally come up with something that we’re ready to call an official bug fix release.

    The ReSharper 5.1 release brings together fixes for many critical issues that were resolved in prior nightly builds. Improvements include:

    • Typing latency in ASP.NET Web Forms and MVC projects is considerably reduced.
    • Markup files in ASP.NET projects don’t lose references to code-behind files when saving files.
    • No more error highlighting over good code in web projects without obvious reasons (for example, on saving files).
    • Visual Studio crashes triggered by ReSharper activity in several scenarios are diagnosed and fixed.
    • No more memory leak on closing and reopening solutions in Visual Studio 2010.
    • Ctrl+click (Go to Declaration) in Visual Studio 2010 now works consistently, without forcing you to click a code symbol multiple times.
    • Dialogs and tool windows that use tree controls are now rendered much faster than before.
    • Splitting Visual Studio text editor tabs no longer hides ReSharper features in either tab.
    • Silverlight 4 support is fixed: when developing server-side and client-side assemblies in Silverlight 4, you don’t get false error highlighting anymore.
    • Encoding issues are over: specifically, adding Cyrillic comments and creating custom controls in WPF projects doesn’t change file encoding.
    • The locale no longer changes based on the Windows locale during ReSharper refactorings.

    Here are the complete release notes.

    If you’re still experiencing serious issues with ReSharper 5.1, please check against the Known Issues and Workarounds blog post to make sure that your problems are not related to Visual Studio itself or another external tool. If you’re still experiencing issues, you know where to find our bug tracker.

    In addition to bug fixes, ReSharper 5.1 introduces support for XAML 2009 that includes highlighting and quick-fixes for language errors.

    To illustrate this, let’s take a language version downgrade scenario: say you’re working on a WPF project where only XAML 2006 is allowed, but you’ve got a generic object from XAML 2009. First of all, ReSharper highlights the object as an error:

    When you press Alt+Enter on the highlighting, there’s a quick-fix in the list where ReSharper suggests declaring a type inherited from System.Collections.Generic.List<string>:

    When you apply the quick-fix, ReSharper does two things. First, it creates a new .cs file where it declares a new wrapper type that inherits from List<string>:

    Second, in the original XAML file, it deletes the TypeArguments attribute, changes the type of the object to the new wrapper type, and inserts a new namespace directive if necessary:

    With this ReSharper 5.1 release, we’re closing the ReSharper 5.1 Early Access Program in order to focus on ReSharper 6 development. We’re really hoping that critical issues that affect multiple users are firmly behind us and we won’t have to reopen the 5.1 EAP.

    Download ReSharper 5.1

    July 06, 2010

    Show Covering Tests with dotCover

    One of the new features dotCover has added is the ability to find tests that cover a certain piece of code. Something remotely similar has been available in ReSharper, although it has been kind of an archaic solution (i.e. Find Usages on Method calls, locate Test assemblies in Result window).

    dotCover makes this easier by providing quick access to this information and extends it in functionality.

    Below we can see some tests from an MVC application. Let’s run Code Coverage on it using dotCover first.

    image

    Now let’s switch over to the Source Code and select some random source code, in this case the MembershipService.ChangePassword line in the ChangePassword action:

    image

    In order to see the tests that cover this line of code, we can either press the default key combination of Ctrl+Alt+K or select the option Show Covering Tests from the dotCover menu:

    image

    dotCover will then display a small window showing all the different tests that cover that line of code.

    image

    At this point, we can run the selected tests or add them to the existing ReSharper Test Runner session. This allows us to easily jump from specific sections of code to the corresponding tests and execute them instantly.

    One minor note: The default key mapping conflicts with KeePass, but you can easily re-assign it via Visual Studio Tools | Options | Keyboard, or do as I did and change KeePass to Ctrl+Alt+P (P as in Password…makes more sense).

    image

    July 02, 2010

    dotTrace 4 Pricing

    Good news today: we have finalized pricing options for both new dotTrace 4.0 licenses and upgrades.

    Before you familiarize yourself with the pricing scheme, here’s a quick reminder of the new dotTrace product and editioning scheme:

    • dotTrace stops being an all-in-one profiler and splits into two products: dotTrace Performance and dotTrace Memory.
    • dotTrace 4.0 Performance is scheduled for release in August 2010, in two editions: Standard and Professional. Here’s how the two editions compare. In short, compared to the Standard edition, dotTrace 4.0 Professional adds remote profiling, support for Silverlight 4 and .NET Compact Framework 3.5.
    • dotTrace 4.0 Memory will not be released simultaneously with the new performance profiler but rather a few months after it. The current release schedule for dotTrace 4.0 Memory is Fall of this year.
    • Because of the staggered release dates of the two 4.0 profilers, for the time being, we’ll make available dotTrace 3.5 Memory. As opposed to dotTrace 4.0 Performance, which is a totally redesigned product, dotTrace 3.5 Memory is the memory profiling part cut from dotTrace 3.1 and reinforced with support for CLR 4 applications.

    With the terms clarified, here are a few summary points regarding upgrades:

    1. We’re ready to announce prices for upgrading from your existing dotTrace licenses to dotTrace 4 Performance and dotTrace 3.5 Memory. Regarding upgrades to dotTrace 4.0 Memory, we’ll update you on that this Fall, as soon as we approach the dotTrace 4.0 Memory release.
    2. You’ll be able to upgrade your existing dotTrace license in any way you want:
      • To dotTrace 4.0 Performance.
      • To dotTrace 4.0 Memory.
      • To the bundle of these products.
    3. If you have purchased your dotTrace license on or after December 17, 2008, you’ll get a free upgrade to the bundle of dotTrace 4.0 Performance Professional + dotTrace 3.5 Memory Standard. This applies to any kind of license: personal, per-developer, or floating.
    4. If you have purchased any dotTrace license before December 17, 2008, you’ll get dotTrace 3.5 Memory Standard for free (this free upgrade opportunity, once again, is valid for all kinds of licenses), and an option to upgrade to dotTrace 4.0 Performance at a price according to the following upgrade table. All upgrade rates are roughly 60% of corresponding new license rates.



    Table 1. dotTrace pricing: upgrade licenses

    Upgrade to:                                            3.5 Memory    4.0 Performance    4.0 Performance    Bundle: 4.0 Performance Pro
                                                           Standard      Standard           Professional       + 3.5 Memory Standard
    Any license purchased on or after December 17, 2008    N/A           N/A                N/A                FREE
    Personal license                                       FREE          $119               $179               $179
    Per-developer commercial license                       FREE          $199               $299               $299
    Floating commercial license                            FREE          $1199              $1799              $1799


    In case you don’t use dotTrace, here are the prices for new dotTrace licenses.


    Table 2. dotTrace pricing: new licenses

                                        3.5 Memory    4.0 Performance    4.0 Performance    Bundle: 4.0 Performance Pro
                                        Standard      Standard           Professional       + 3.5 Memory Standard
    Personal license                    $149          $199               $299               $399
    Per-developer commercial license    $299          $399               $599               $749
    Floating commercial license         $999          $1999              $2999              $3499

    June 07, 2010

    Relaunching on Twitter

    For reasons that are too deathly boring to go into here, I’ve changed my name on Twitter. Because I ended up creating a new profile instead of changing the name (again, for reasons not worth talking about), you’ll need to re-follow if you’re interested in what I might have to say there. New location: http://twitter.com/panopticoncntrl.

    Hope to see you there!

    May 27, 2010

    Another transition…

    After spending a year and a half working on “M”, I’ve decided to make another change in what I’m doing and move over to the SQL Server Programmability team. That’s the team responsible for things like the T-SQL language and runtime in SQL Server. Working on “M” was a lot of fun and the team was great, but after spending a good, long while down in the bowels of a GLR parser, I decided that that was enough and that it was time to do something else. Working on SQL Server programmability is, in some ways, a combination of all my previous jobs: a bit of data from Access, a bit of runtime from OLE Automation, and a bit of programming language from Visual Basic and “M”. It’s also an interesting challenge: a product that’s both well established and confronting a lot of new challenges. I think it’s going to be quite a bit of fun!

    It does mean saying goodbye to “M”, and that was sad (although, really, they’re still in the same division and not that far away), but that’s the way it goes. I’ll be looking forward to their next CTP, which is where people will see a lot of the hard work that’s been going on and the overall direction that the language is headed. There’s a lot of cool stuff coming, and I think people will find it very interesting!

    Changing jobs also means that I’m back to drinking from the firehose, learning the ins and outs of the guts of the SQL Server engine, as well as T-SQL. Interesting stuff. Any good T-SQL/SQL Server blogs anyone can recommend?

    May 10, 2010

    C# 4.0/BCL 4 Series: Complex numeric type

    This is part of a series. Note: This material is from C# 4.0 In A Nutshell Page 239.

    Like BigInteger, the Complex struct is another specialized numeric type new to Framework 4.0; it represents complex numbers with real and imaginary components of type double. It also lives in the System.Numerics.dll assembly. To use Complex, instantiate the struct, specifying the real and imaginary values:

         var c1 = new Complex(2, 3.5);

         var c2 = new Complex(3, 0);

    There are also implicit conversions from the standard numeric types.

    The Complex struct exposes properties for the real and imaginary values, as well as the phase and magnitude:

         Console.WriteLine(c1.Real);             // 2

         Console.WriteLine(c1.Imaginary);      // 3.5

         Console.WriteLine(c1.Phase);           // 1.05165021254837

         Console.WriteLine(c1.Magnitude);    // 4.03112887414927

    You can also construct a Complex number by specifying the magnitude and phase:

         Complex c3 = Complex.FromPolarCoordinates(1.3, 5);

    The standard arithmetic operators are overloaded to work on Complex numbers:

         Console.WriteLine(c1 + c2);      // (5, 3.5)

         Console.WriteLine(c1 * c2);       // (6, 10.5)

    The Complex struct exposes static methods for more advanced functions, including:

    • Trigonometric (Sin, Asin, Tan, etc.)
    • Logarithms and exponentiations
    • Conjugate

     

    C# 4.0/BCL 4 Series: BigInteger

    This is part of a series.

    Another new type in Framework 4.0 is the BigInteger specialized numeric type. It lives in the new System.Numerics namespace and lets you represent an arbitrarily large integer without any loss of precision.

    Since C# does not provide native support for BigInteger, there's no way to represent BigInteger literals. What you can do, however, is implicitly cast from any other integral type to a BigInteger:

          BigInteger theSecretOfLife = 42;

    That's not too useful. To represent a bigger number, such as one googol (10 to the 100th power), you can use one of BigInteger's static methods, such as Pow (raise to the power):

          BigInteger googol = BigInteger.Pow(10, 100);

    Alternatively, you can Parse a string:

          BigInteger googol = BigInteger.Parse("1".PadRight(101, '0'));    // "1" followed by 100 zeros

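    You can implicitly convert any standard integral type to a BigInteger. Converting a floating-point value to a BigInteger, or a BigInteger back to a standard numeric type, requires an explicit cast because it can lose information:

           double g1 = 1e100;                  // a plain double; no cast involved

           BigInteger g2 = (BigInteger) g1;    // explicit cast from double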

    Calling ToString() on the googol variable prints every digit:

    using System;
    using System.Numerics;

    namespace BigInteger
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Fully qualified because the enclosing namespace is also named BigInteger.
                System.Numerics.BigInteger googol = System.Numerics.BigInteger.Pow(10, 100);
                Console.WriteLine(googol.ToString());
            }
        }
    }

    10000000000000000000000000000000000000000000000000000000000000000000000000000000
    000000000000000000000
    Press any key to continue . . .

     

    April 22, 2010

    Using Hadoop Streaming for XML processing


    In a few previous posts I talked about a project that we’re working on that involves analyzing a lot of XML documents from pubmed. We’re currently not using Hadoop to parse the raw XML; however, due to the large number of documents in pubmed and the time it takes to do the parsing, we’ve been discussing options that would allow us to scale up the processing to happen on multiple machines. Since we’re already using Hadoop for analysis, I decided to poke around a bit to see if we could figure out a way to use Hadoop for the parsing of the 617+ XML documents.

    After some digging I came across a section of the Hadoop Streaming documentation that said the following: “You can use the record reader StreamXmlRecordReader to process XML documents….Anything found between BEGIN_STRING and END_STRING would be treated as one record for map tasks.”

    After a few tries I wasn’t having much success, so I continued to look for alternate options. I came across Paul Ingles’ post on Processing XML with Hadoop, which pointed me to the XmlInputFormat class in Mahout. I believe that in order to use the XmlInputFormat class from Mahout I either need to recompile Hadoop with that class included or use a jar file for my jobs that includes that class. Since we’re writing our mappers and reducers in Ruby, I didn’t have a jar to add the class to.

    In hopes that I was just being stupid with the StreamXmlRecordReader, I decided to return to it and attempt to get it working. After configuring it I saw some positive things in the console as I ran my job. It did in fact look like Hadoop was breaking apart my XML documents into the appropriate chunks (using the start and end tags I specified in my config):

    hadoop jar hadoop-0.20.2-streaming.jar \
       -input medline10n0515.xml \
       -output out \
       -mapper xml-mapper.rb \
       -inputreader "StreamXmlRecordReader,begin=<MedlineCitation,end=</MedlineCitation>" \
       -jobconf mapred.reduce.tasks=0
    

    The next thing to figure out was how I should be retrieving the entire XML contents from within my mapper. With Hadoop Streaming the input is streamed in via STDIN so I attempted building up the XML myself using some mega-smart “parse” logic!

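    #!/usr/bin/env ruby
    # Builds up one <MedlineCitation> record at a time from the lines on STDIN.
    # convert_to_json is assumed to be defined elsewhere; it is not shown in this post.
    xml = ""   # start with an empty string so += works before the first start tag
    STDIN.each_line do |line|
      line.strip!

      if line.include?("<MedlineCitation")
        xml = line    # start of a new record
      else
        xml += line   # continuation of the current record
      end

      if line.include?("</MedlineCitation>")
        puts convert_to_json(xml)
      end
    end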
    

    As you can see, I look for the start and end tags relevant for my XML, and once I have a complete document I pass the XML to the convert_to_json method. There’s definitely quite a bit of cleanup that can be done, as well as edge cases that aren’t handled (nested tags that match the root tag), but we’ve at least coerced Hadoop into doing what we want. Next up is seeing how well it works when run against the entire dataset.
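
    The accumulation logic in the mapper can be pulled into a small function so it can be tried out without a Hadoop cluster. This is only a sketch under a few assumptions: records never nest, the tags are the same ones passed to StreamXmlRecordReader, and split_records is a hypothetical helper name that is not part of Hadoop Streaming or of the original mapper.

```ruby
require "stringio"

BEGIN_TAG = "<MedlineCitation"
END_TAG   = "</MedlineCitation>"

# Hypothetical helper: returns one string per complete begin/end tag pair.
# Lines seen before the first begin tag are ignored.
def split_records(io)
  records = []
  buffer = nil
  io.each_line do |line|
    line = line.strip
    buffer = String.new if line.include?(BEGIN_TAG)  # start a fresh record
    next if buffer.nil?                              # not inside a record yet
    buffer << line
    if line.include?(END_TAG)
      records << buffer
      buffer = nil
    end
  end
  records
end

# Made-up sample input standing in for what Hadoop would stream via STDIN.
sample = StringIO.new(<<~XML)
  <?xml version="1.0"?>
  <MedlineCitation Owner="NLM">
    <PMID>1</PMID>
  </MedlineCitation>
  <MedlineCitation Owner="NLM">
    <PMID>2</PMID>
  </MedlineCitation>
XML

records = split_records(sample)
puts records.length
puts records.first
```

    In a real mapper the StringIO would be replaced by STDIN, and each record would be handed to whatever conversion step is in use.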

    April 20, 2010

    Microsoft Silverlight 4 Business Application Development: Beginner's Guide

    Build enterprise-ready business applications with Silverlight

    • An introduction to building enterprise-ready business applications with Silverlight quickly.
    • Get hold of the basic tools and skills needed to get started in Silverlight application development.
    • Integrate different media types, taking the RIA experience further with Silverlight, and much more!
    • Rapidly manage business focused controls, data, and business logic connectivity.
    • A suite of business applications will be built over the course of the book and all examples will be geared around real-world useful application developments, enabling .NET developers to focus on getting started in business application development using Silverlight.

    In Detail

    Microsoft Silverlight is a programmable web browser plug-in that enables features including animation, vector graphics, and audio-video playback--features that characterize Rich Internet Applications. Silverlight makes possible the development of RIA applications in familiar .NET languages such as C# and VB.NET.

    Silverlight is a great (and growing) Line of Business platform and is increasingly being used to build business applications. Silverlight 3 made a big step in LOB; Silverlight 4 builds upon this further. This book will enable .NET developers to feel the pulse of business application development with Silverlight quickly.

    This book is not a general Silverlight 3/4 overview book. It is uniquely aimed at developers who require an introduction to building business applications with Silverlight. This book will focus on building a suite of real-world, useful business applications in a practical hands-on approach. This book is for .NET developers, providing the answers to many questions that are encountered when creating business applications in Silverlight, ultimately enabling rapid development with ease!

    This book teaches you how to build business applications with Silverlight 3 and 4. Building a suite of applications, it begins by introducing you to the basic tools and skills needed to get started in Silverlight development. It then dives deeply into the world of business application development, covering all the required concepts needed to build sophisticated business applications and provide a rich user experience. Chapters include building a public website, adding rich media to the website, and incorporating RIA into your website, among others.

    By following the practical steps in this book, you will learn what's needed to create rich business applications--from the creation of a Silverlight application, to enhancing your application with rich media and connecting your Silverlight application to various Data Sources.

    What you will learn from this book

  • Learn the basic tools and skills needed to get started in Silverlight 4 business application development.
  • Discover how to enhance your Silverlight business applications with rich data such as sound and video.
  • Know when and how to customize your data in Silverlight using important data controls.
  • Understand how your Silverlight business applications can connect to various Data Sources.
  • Deliver your Silverlight business application in a variety of forms.

     

    Interested? Read Chapter 1 – Getting Started for free!

    March 31, 2010

    Attempts at Analyzing 19 million documents using MongoDB map/reduce


    Over the course of the last couple weeks we’ve been developing a system to help analyze the 19+ million documents in the pubmed database. In my previous post I shared details about the process that we’ve been using to bring down the ~617 zipped XML documents that contain the articles and import them into MongoDB. Today I’m going to share a few more details about our attempts at analyzing the pubmed database using the Map/Reduce capabilities MongoDB offers.

    After completing the download, unzip, parse, and load steps required to get the pubmed articles into our MongoDB instance we set out to use the map/reduce capabilities in MongoDB to do analysis and aggregation. Our initial work has focused on the keywords and MESH headings within pubmed articles, as well as on the relationships between authors within pubmed. Our end goal is to have a profile for every author who has published an article in pubmed with details about what keywords and MESH headings appear most within the articles they publish, as well as who they commonly co-author articles with.

    In order to build this profile we set out to write a map/reduce job to count the number of articles written by each author by keyword. Our job writes the results of the map/reduce job to a named collection.

    connection = Connection.new config["mongodb"]["host"]
    db = connection.db(config["mongodb"]["db"])
    collection = db.collection("articles")
    
    map = ...
    reduce = ...
    
    result = collection.map_reduce map, reduce, 
                                   :verbose => true, 
                                   :out => "keywordstats"
    

    The “keywordstats” map/reduce job resulted in over a half million documents being inserted into the keywordstats collection.

    # keywordstats example document
    { 
      _id: author_name, 
      keywords: { 
        keyword1: 310, 
        keyword2: 21, 
        keyword3: 22 
      }
    }
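
    The map and reduce functions themselves are elided above, so the following is not the job we ran. It is a plain-Ruby, in-memory illustration of the shape of the computation that produces documents like the one shown: emit one small count per author/keyword pair, then sum the counts per author. The sample articles and the field names authors and keywords are assumptions made up for the example.

```ruby
# Made-up articles; the field names are assumptions, not the real schema.
articles = [
  { authors: ["Smith J", "Lee K"], keywords: ["asthma", "genetics"] },
  { authors: ["Smith J"],          keywords: ["asthma"] }
]

# Map step: emit [author, { keyword => 1 }] for every author/keyword pair.
emitted = articles.flat_map do |article|
  article[:authors].product(article[:keywords]).map do |author, keyword|
    [author, { keyword => 1 }]
  end
end

# Reduce step: merge the per-keyword counts for one author by summing them.
reduce = lambda do |values|
  values.reduce({}) { |acc, v| acc.merge(v) { |_key, a, b| a + b } }
end

# Group the emitted values by author and reduce each group.
keywordstats = emitted.group_by(&:first).map do |author, pairs|
  { _id: author, keywords: reduce.call(pairs.map(&:last)) }
end

keywordstats.each { |doc| p doc }
```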
    

    The keyword map/reduce analysis took approximately 30 minutes to run and didn’t cause us to think twice about our use of MongoDB map/reduce for our analysis. Next we moved on to analyzing MESH headings. Since MESH headings are pubmed’s official way of categorizing articles, there are a lot more articles with MESH headings, and thus a lot more crunching for MongoDB to do. The map/reduce jobs for the MESH headings were almost exactly the same as those for keywords, but the processing took much longer due to the larger number of articles with MESH headings assigned. When all was said and done, MongoDB was able to process our map/reduce jobs for MESH headings, but it took over 15 hours to complete (Note: we didn’t do any optimization work, so it’s likely this could be trimmed).

    The large increase in time required to analyze the MESH headings made us start to think about what other options we might consider. However, we pressed on to our final analysis: author/co-author relationships. Our goal with the author/co-author analysis is to be able to see who authors are co-authoring with most. Additionally, we want to be able to create a network graph of all the authors within pubmed to do social network analysis on the graph. In order to create the network, we need to be able to figure out who has written with whom so we can create an edge between the relevant author nodes.

    Since every article within pubmed has an author, and often multiple authors, we expected this bit of analysis to be the most taxing on MongoDB. Pretty soon after kicking off our author/co-author jobs we ran into problems. Due to the large number of author/co-author relationships and the fact that a single author may co-author papers with many other authors, we were unable to get our job to run without hitting the size limitations MongoDB places on documents.

    We evaluated other map/reduce strategies that would reduce the document size; however, the limitations that MongoDB places on the mappers and reducers prevented us from implementing those alternate strategies. To be more specific, MongoDB requires the mapper and reducer to emit the same structure. From the map phase we were emitting:

      author, {coauthor1: 1} #emit for each author/co-author "pair"
    

    And in our reduce phase we were consolidating all the co-author counts into a single hash to end up with:

    { 
      _id: author_name, 
      value: { 
        coauthor1: 31, 
        coauthor2: 211, 
        coauthor3: 122
      }
    }
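
    One way to see why the map output and the reduce output have to share a structure is that a reduce step may be fed its own earlier results along with freshly emitted values. The plain-Ruby sketch below (not MongoDB code; merge_counts is a hypothetical name) merges co-author hashes and checks that reducing in stages gives the same answer as reducing everything at once.

```ruby
# Hypothetical reduce step: sums co-author counts across a list of hashes.
# Input and output are both { coauthor => count } hashes, so a previous
# result can be passed back in as just another value.
def merge_counts(values)
  values.each_with_object(Hash.new(0)) do |counts, total|
    counts.each { |coauthor, n| total[coauthor] += n }
  end
end

# Values as they might be emitted by the map phase for one author.
emitted = [{ "coauthor1" => 1 }, { "coauthor2" => 1 }, { "coauthor1" => 1 }]

all_at_once = merge_counts(emitted)

# Reduce the first two values, then feed that result back in with the third.
partial   = merge_counts(emitted[0, 2])
in_stages = merge_counts([partial, emitted[2]])

p all_at_once
p in_stages == all_at_once
```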
    

    We found that some authors had so many papers, and thus so many coauthors, that we were blowing past the size limitations MongoDB places on documents. An alternate strategy that we considered was changing our reduce stage to output a single author/coauthor relationship with a count, rather than our initial approach, which reduced to an author with a hash containing all the coauthors with their counts. However, since we can only reduce to a single output, we would need to change our mapper to emit the author/co-author pair as the key. Our initial attempts with this approach weren’t working well, which prompted us to take another step back to consider alternate approaches.
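
    To make the pair-as-key idea concrete, here is a small in-memory Ruby illustration. It is not the MongoDB job we attempted and says nothing about how that approach performs in MongoDB; the author lists are made up. Each emitted key is an [author, coauthor] pair and each value is a plain integer, so no single result grows with the number of co-authors.

```ruby
# Made-up author lists, one per paper.
papers = [
  ["A", "B", "C"],
  ["A", "B"],
  ["B", "C"]
]

# Map step: emit [[author, coauthor], 1] for every ordered pair on a paper.
emitted = papers.flat_map do |authors|
  authors.permutation(2).map { |pair| [pair, 1] }
end

# Reduce step: sum the counts for each pair key.
pair_counts = emitted.group_by(&:first).transform_values do |rows|
  rows.sum { |_pair, n| n }
end

pair_counts.each { |(author, coauthor), n| puts "#{author} -> #{coauthor}: #{n}" }
```

    The trade-off is many small results instead of one large one per author, so building an author profile takes a further grouping step.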

    Given our needs and the amount of custom analysis we want to do against this (and other largish datasets) we decided to spend some time investigating Hadoop and Amazon Elastic Map Reduce. Our initial experiences have been very positive, and have us feeling much more confident that the technology choice (Hadoop) won’t prevent us from exploring different types of analysis.

    We still feel that Mongo will be a great place to persist the output of all of our Map/Reduce steps; however, we don’t feel that it’s well suited to the type of analysis that we want to do. With Hadoop we can scale our processing quite easily, we have tremendous flexibility in what we do in both the map and reduce stages, and, most importantly to us, we’re using a tool that is designed specifically for the problem we’re trying to “solve”. Mongo is a nice schema-free document database with some map/reduce capabilities, but what we need for our analysis stage is a complete map/reduce framework. We’ll still be using Mongo; we’ll just be using it for what it’s good at and Hadoop for the rest.

    March 18, 2010

    Large Scale Data Processing with MongoDB Map/Reduce (Part 1: Background)

    Over the course of the last week I’ve been working with a member of our team to develop a prototype data processing “engine” for analyzing articles within the pubmed database. The pubmed database consists of approximately 19 million articles that can be downloaded as approximately 617 zipped XML documents.

    Our initial work has focused on downloading the complete dataset, pulling out the bits that we have interest in, and importing them into MongoDB. For our initial analysis we’re focusing on a subset of the details available for each article. In the future we’ll likely expand our analysis to include more details.

    We started by downloading the 617 zipped XML documents from pubmed. Once downloaded we unzipped each file, parsed out the bits that we’re interested in and saved the details in a JSON file optimized for importing into MongoDB. 1

    Once all the XML files were processed and the details were saved out to a JSON file, we used the mongoimport utility to import the JSON files into MongoDB.
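
    As a rough illustration of the parse-and-export step, here is a minimal Python sketch that pulls a few fields out of an XML document and writes one JSON record per line. The element names (MedlineCitation, PMID, ArticleTitle, AuthorList) are assumptions about the pubmed XML layout, the sample record is invented, and the output field names are placeholders; this is not the Resque worker code we actually ran.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample record; the element names are assumed, not taken from the real files.
SAMPLE_XML = """
<MedlineCitationSet>
  <MedlineCitation>
    <PMID>12345</PMID>
    <Article>
      <ArticleTitle>An example title</ArticleTitle>
      <AuthorList>
        <Author><LastName>Smith</LastName><ForeName>Jane</ForeName></Author>
        <Author><LastName>Lee</LastName><ForeName>Kim</ForeName></Author>
      </AuthorList>
    </Article>
  </MedlineCitation>
</MedlineCitationSet>
"""


def extract_articles(xml_text):
    """Yield one small dict per article, keeping only the fields of interest."""
    root = ET.fromstring(xml_text.strip())
    for citation in root.iter("MedlineCitation"):
        authors = [
            f"{a.findtext('LastName', '')} {a.findtext('ForeName', '')}".strip()
            for a in citation.iter("Author")
        ]
        yield {
            "pmid": citation.findtext("PMID"),
            "title": citation.findtext("Article/ArticleTitle"),
            "authors": authors,
        }


def to_json_lines(articles):
    """Serialize each article as a single JSON record on its own line."""
    return "\n".join(json.dumps(article) for article in articles)


print(to_json_lines(extract_articles(SAMPLE_XML)))
```

    A file in this shape could then be loaded with something like `mongoimport --db pubmed --collection articles --file articles.json`, where the database, collection, and file names are placeholders.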

    The above process was run over the course of a couple days. The most time-consuming part was the parsing of the XML files. We wrote Resque workers to handle the above so that the work could be distributed to multiple nodes running on EC2; however, I ended up running things locally so that I could test the process. Given that the pubmed database doesn’t change that often, and that we’ll rarely need to re-process the entire dataset, having it run on a single machine over the course of a couple days will likely suffice.

    After importing all the articles into MongoDB we had a pretty large MongoDB database consisting of ~18 million “documents”. With the articles loaded into MongoDB, we moved on to the next step…analyzing all 18 million documents.


    1 MongoDB likes a single JSON record on each line.
    2 This is my first blog post in ages, I need to get back into it slowly, oh so very slowly! :-)

    January 01, 2010

    Microsoft MVP Award 2010 – Thank you!

    Reading my mails today:

    Dear Michael Schwarz,

    Congratulations! We are pleased to present you with the 2010 Microsoft® MVP Award! This award is given to exceptional technical community leaders who actively share their high quality, real world expertise with others. We appreciate your outstanding contributions in .NET Micro Framework technical communities during the past year.

    The Microsoft MVP Award provides us the unique opportunity to celebrate and honor your significant contributions and say "Thank you for your technical leadership."

    This is now my fifth MVP year (after 2006, 2007, 2008 and 2009); it started in 2006 with the Microsoft MVP in Visual Studio Development / ASP.NET. I’m really surprised that I got the MVP award again. Thank you Microsoft, and a big thank you to all of you!!

    2010 will be a great year with great new products! I hope I can keep giving my best to you and help support Microsoft products.

    Microsoft MVP 2010 Award for Groove:Architecture (3rd time)

    My thanks to Microsoft for the Most Valuable Professional (MVP) Award for 2010 - this is the 3rd time in a row! I have been active with the Groove and the CTDOTNET community for the past several years and it is always a pleasure to be helpful and work with other developers.

    Groove has evolved considerably over the years (since '01) and works well with other collaborative platforms, especially SharePoint. The new & upcoming version of Groove will be SharePoint WorkSpace 2010 – find out more about it at the Office 2010 Beta website.

    November 29, 2009

    Hyper-V Virtualization CPU compatibility utilities

    Two utilities from AMD and Intel to check your CPUs for compatibility for virtualization -

    Intel - The Processor Identification Utility checks your CPU for virtualization, 64-bit and threading support.

    AMD - This utility checks your system’s capabilities to facilitate testing of Microsoft Hyper-V on platforms with AMD microprocessors.

    January 30, 2009

    Network Visualization on the Web

    Over the course of the last couple months I’ve been doing quite a bit of investigation of, and experimentation with, existing network visualization libraries. There are a number of libraries available, some open source, some built specifically for the web, others meant for a desktop environment, some in Java, others in Flash, and round and round we go.

    I’ve talked to quite a few people who have specific expertise in technologies for doing network visualization as well, ranging from Flash to JavaScript to Silverlight to Java. My conclusion thus far is that large scale network visualizations (300+ nodes) are hard. Once you cross the 100 node mark, you begin to have serious problems with laying out the network in a way that is usable by the user of the system that the visualization is within. Drop on top of that the desire to make the visualization interactive (zoom, click, drag, etc), as well as the desire to have the visualization software figure out the best layout for the network itself, and you have a pretty difficult problem to solve.

    I’m currently doing some prototypes myself using Silverlight. I don’t love the idea of using Silverlight since I doubt the penetration of Silverlight is as great as some have proclaimed, but the advantages it offers are hard to look past. As a longtime .NET/C# developer I’m very comfortable with the development tools used to build Silverlight applications, as well as the language within which to do so, C#. Silverlight appears to offer some pretty decent performance, and I suspect that it will get better as the VM improves. The major disadvantage of Silverlight, which I don’t know the validity of, is its lack of an existing user base. Since it’s relatively new, and not many sites use it, I suspect the installed base of Silverlight is much smaller than that of something like Flash.

    The other piece of software that I’ve been spending a bit of time with is Graphviz. Graphviz is good at creating network visualizations, and supports a number of different layout algorithms. Unfortunately the output isn’t always great, and it most certainly isn’t very interactive. What I’m experimenting with is using Graphviz to pre-compute the network layout, and then feeding that positional information into the Silverlight visualization. The primary advantage will be that the Silverlight app won’t have to figure out the initial layout; however, it will be able to handle all the nice visualization and interactivity that’s desired. The question still remains, is Silverlight up to the challenge? Or is Flash, Processing, or a pure Java applet more appropriate/capable? Only time will tell….
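
    To give a feel for the pre-computed layout idea, here is a minimal Python sketch (not my Silverlight prototype). It writes an edge list out as DOT source and reads node coordinates back from Graphviz's plain-text output format, whose node lines start with `node name x y`. The function names and the hand-written sample output are made up for illustration.

```python
import shlex


def to_dot(edges):
    """Render an undirected edge list as Graphviz DOT source."""
    lines = ["graph network {"]
    lines += [f'  "{a}" -- "{b}";' for a, b in edges]
    lines.append("}")
    return "\n".join(lines)


def parse_plain_positions(plain_text):
    """Read node positions from plain-format output lines of the form `node name x y ...`."""
    positions = {}
    for line in plain_text.splitlines():
        fields = shlex.split(line)  # shlex keeps double-quoted node names in one piece
        if len(fields) >= 4 and fields[0] == "node":
            positions[fields[1]] = (float(fields[2]), float(fields[3]))
    return positions


# Hand-written sample in the plain format, standing in for real layout output.
SAMPLE_PLAIN = """graph 1 2.5 1.5
node a 0.375 1.25 0.75 0.5 a solid ellipse black lightgrey
node b 2.125 0.25 0.75 0.5 b solid ellipse black lightgrey
edge a b 4 0.7 1.1 1.1 0.8 1.5 0.6 1.8 0.4 solid black
stop
"""

print(to_dot([("a", "b")]))
print(parse_plain_positions(SAMPLE_PLAIN))
```

    In practice the DOT text would be saved to a file and laid out with something like `dot -Tplain network.dot` (the file name is a placeholder), and the resulting coordinates handed to the client as its starting node positions.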

    December 28, 2008

    Moving Gems from one version of Ruby Enterprise Edition to Another

    As mentioned in my previous post, I recently built a small internal micro app with merb. As part of the process of deploying that app I needed/wanted to update to the latest version of Ruby Enterprise Edition (REE) and Passenger on my slice. One of the issues I ran into while trying to update the REE version is that all my old gems were not installed in my fresh new version of REE. There may be a better way to accomplish this task, but the approach I ended up using was to modify this capistrano file (http://github.com/jtimberman/ree-capfile/tree/master) to install the gems from the old version of REE into the new version.
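
    The general idea can be sketched without capistrano. The following Python snippet is only an illustration, not the capfile linked above: it parses text in the `name (version, version)` shape that `gem list` prints and builds one `gem install NAME -v VERSION` command per gem version. The sample output, the version numbers, and the path to the new gem binary are all made up, and entries that carry a platform suffix after the version are not handled.

```python
import re

# Matches lines such as "rack (0.9.1, 0.4.0)".
GEM_LINE = re.compile(r"^(?P<name>\S+) \((?P<versions>[^)]*)\)$")


def parse_gem_list(output):
    """Return (name, version) tuples from `gem list` style output."""
    gems = []
    for line in output.splitlines():
        match = GEM_LINE.match(line.strip())
        if not match:
            continue  # skip headers such as "*** LOCAL GEMS ***" and blank lines
        for version in match.group("versions").split(","):
            version = version.strip()
            if version.startswith("default:"):
                version = version[len("default:"):].strip()
            gems.append((match.group("name"), version))
    return gems


def install_commands(gems, gem_binary):
    """Build one install command per gem version for the new Ruby's gem binary."""
    return [f"{gem_binary} install {name} -v {version}" for name, version in gems]


# Invented sample of what the old installation might report.
SAMPLE_OUTPUT = """
*** LOCAL GEMS ***

merb-core (1.0.6)
rack (0.9.1, 0.4.0)
"""

gems = parse_gem_list(SAMPLE_OUTPUT)
for command in install_commands(gems, "/opt/ree-new/bin/gem"):
    print(command)
```

    The printed commands could be reviewed by hand before running them, which is a reasonable safety net when a gem has native extensions that may not build cleanly on the new version.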

    October 31, 2008

    An embarrassment of riches on VB 10.0 and Oslo

    Now that we’re past the PDC, there are a bunch of video resources coming out on VB 10.0 and Oslo. Here’s a roundup of what’s available so far:

    The Pearson folks also recorded some vidcasts they call OnMicrosoft. If you go to the previous link, you can see all the videos posted, but the ones of interest to this blog are:

    There are other Oslo vidcasts on the site, so check them out as well.

    October 29, 2008

    Future Directions for Visual Basic

    Yesterday I gave my valedictory address on Visual Basic at the PDC. I think the talk went well and it was a lot of fun, if a little sad, since it’s one of the last times I’ll be giving a talk about Visual Basic. We covered a lot of exciting stuff, some of which should be familiar to readers of the blog. I’ll let people know when the video is up on the Channel9 page for the talk; it should be some time today. For those of you who don’t want to sit through the talk, it went something like this:

    • First, we talked a bit about the role of Visual Basic at Microsoft as the language that makes Microsoft platforms really accessible to programmers.
    • Then we segued into talking about the increased commitment that the languages groups are making to ensure that Visual Basic and C# coordinate language features so that users of one language aren’t left out in the cold when the other language adds some useful feature. This isn’t to say that we’re going to do things in exactly the same way, or even that the languages will have exactly the same feature set, but that we’re committing to ensuring that the fundamental capabilities in the languages stay in better sync than they have over the previous eight years.
    • Then Lucian did a really wonderful demo of VB 10.0, which is shipping in Visual Studio 2010. He showed (IIRC) the following features that should be familiar to the readers of this blog: array literals, collection initializers, automatic properties, implicit line continuations, statement lambdas, generic variance, and a feature that embeds primary interop assembly types in your assembly so you don’t have to deploy the PIA. I may have missed some, so check out the video when it’s posted!
    • Finally, we talked about some of the trends that we see affecting Visual Basic going forward and talked about some of the work we’re starting to do for post-VS 2010 to move the Visual Basic compiler to managed code and open it up to the world so that you can take advantage of the services that it provides.

    If you attended the talk, please evaluate the session! It helps me become a better speaker and helps us give a better PDC. And feel free to stop by the tools lounge today, I’ll be hanging out there most of the day!

    September 28, 2008

    IndiaStockQuotes Version 1.2.1

    It has been a long time since I actually worked on the IndiaStockQuotes component. I had some time over the weekend, fixed some bugs in the component and got out a new release. I also upgraded the component from .NET 2.0 to 3.5. I don't yet use any 3.5-specific features, so you should be able to recompile the source against 2.0 and still get it to run.

    Check it out at India Stock Quotes

    September 26, 2008

    Moving a project from VS 2005 to VS 2008

    When you open a VS 2005 project in VS 2008, Visual Studio offers to migrate the project to the new format. Usually there should be no problem with this, and all your project files, solution files, test cases, etc. should move seamlessly to the 2008 format.

    But if you do build your project you will notice that your output assemblies actually target .NET Framework 2.0 and not 3.5. This is basically because the migration retains the targeted framework to make sure your application does not fail. The method to change this setting after migration is not easy to find.

    For VB projects, this setting is actually hidden inside the My Project -> Compile -> Advanced Compiler Options dialog. Obviously, this is not very easy to find. (See Image)

    In C# projects this setting is a lot easier to find in Project Properties -> Application Tab itself. I am not sure why the VB team actually made this setting so difficult to find.

    December 28, 2007

    (Almost) final VB 9.0 language specification posted

    I wanted to let people know that an (almost) final VB 9.0 language specification has been posted on the download center. The spec is missing some copy-edits from the documentation folks, but is otherwise complete. Since I'm not going to get a chance to incorporate the copy-edits until I am back from vacation in January, I wanted to get the spec out there for anyone interested in documentation of the XML features that weren't present in the previous version of the spec. (I apologize for the lateness of this vis-a-vis the release of the product itself, it's been a busy fall.)

    This updated language specification corresponds to Visual Studio 2008 and covers the following major new features:

    • Friend assemblies (InternalsVisibleTo)
    • Relaxed delegates
    • Local type inferencing
    • Anonymous types
    • Extension methods
    • Nullable types
    • Ternary operator
    • Query expressions
    • Object initializers
    • Expression trees
    • Lambda expressions
    • Generic type inferencing
    • Partial methods
    • XML Members
    • XML Literals
    • XML Namespaces

    Questions, comments or criticisms can be sent to basic@microsoft.com. Thanks!

    November 20, 2007

    Did something important happen today?

    Oh, yeah, that's right. We shipped. Hard to believe we've finally reached the finish line...

    December 20, 2006

    VB 2005 SP1 is released...

    In case you missed the announcement, VS 2005 SP1 is released. You can get it here. Beta support for Vista coming soon!

    November 16, 2006

    VB Hotfixes, now easier to get...

    I discussed a little while back that we've made a few hotfixes available to address some performance issues people have seen with VB. There's now a program that makes these hotfixes available as a regular download, rather than forcing you to call support. I'd recommend anyone running into performance problems give them a try...