Wednesday, 31 July 2013

JavaScript - Memory Leak Diagnostics

Memory leaks in JavaScript seem to be an ever-increasing problem. This is no surprise with JavaScript being used more and more, but have you ever tried to solve a memory leak in JavaScript? It's no simple task: the tools to help determine which objects are leaking simply haven't existed. You're essentially trying to find a needle in a haystack while blindfolded. Not cool!

Until now.

There are three tools I would like to talk about: sIEve, Google Chrome's Heap Snapshot and the new boy on the block, Internet Explorer 11 Developer Tools.

In most cases, as a developer, you need a bit of a nudge in the right direction. We're pretty intelligent, so once we have an idea of where the problem may lie we can usually work it out. sIEve gives you that nudge.
It's a memory leak detector for Internet Explorer. When running, it shows you all the DOM elements that are currently in memory. It then goes one step further and shows you the DOM nodes that are currently leaking, with an ID and everything. You can then use that information to find out why a node is leaking. Usually it's because some piece of JavaScript somewhere still references an element which has since been removed from the DOM.
sIEve - Memory Leak Detector
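
To make that last point concrete, here's a minimal sketch of the pattern. It isn't taken from any real application: all the names (fakeDocument, handlerCache, addPanel and so on) are hypothetical, and plain objects stand in for DOM elements so that it runs outside a browser.

```javascript
// Plain objects stand in for DOM elements so the sketch runs anywhere.
// All names here (fakeDocument, handlerCache, ...) are hypothetical.
const fakeDocument = { children: [] };

// A long-lived cache that maps element ids to the elements themselves.
const handlerCache = new Map();

function addPanel(id) {
  const panel = { id, payload: new Array(1000).fill(id) };
  fakeDocument.children.push(panel);
  handlerCache.set(id, panel); // second reference, outside the "DOM"
  return panel;
}

function removePanelLeaky(id) {
  // Removes the element from the document but forgets the cache entry,
  // so the element stays reachable and cannot be garbage collected.
  fakeDocument.children = fakeDocument.children.filter((c) => c.id !== id);
}

function removePanelClean(id) {
  removePanelLeaky(id);
  handlerCache.delete(id); // drop the last reference as well
}

addPanel("leaky");
addPanel("clean");
removePanelLeaky("leaky");
removePanelClean("clean");

console.log(fakeDocument.children.length); // 0
console.log(handlerCache.has("leaky"));    // true
console.log(handlerCache.has("clean"));    // false
```

Both panels have gone from the "document", but the leaky one is still held by the cache. That leftover reference is exactly the sort of thing a leak detector is nudging you towards.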

I must admit, I used this on a complex web application which had popup windows with iframes inside iframes. When it didn't crash, it reported some nodes as leaks when they weren't, but it did at least give me that nudge to look at a particular screen.

Chrome Heap Snapshot
Now we're talking! This is, to my knowledge, the first proper way of determining which objects specifically are leaking. It allows you to take a snapshot of the objects in memory at a given point in time and then compare these snapshots.

Chrome Heap Snapshot
This is rather handy. It means you can see which objects were created in the first snapshot and still exist in the second snapshot, i.e. the ones that are causing your problem!

The good thing about these snapshots is that they also show you the "retaining tree". This is essentially the path from the root objects to the object in question, which means you can trace the path and work out why your object isn't being garbage collected.
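
If the idea of a retaining path is new to you, the following sketch may help. It is not how Chrome builds its retaining tree; it's just a hypothetical helper (findRetainingPath) that walks a small object graph breadth-first and reports the chain of property names keeping a target object reachable. The appRoot and leakedNode objects are made up for the example.

```javascript
// Breadth-first search from a root object to a target, returning the chain
// of property names that keeps the target reachable. Hypothetical helper,
// only meant to illustrate what a retaining path is.
function findRetainingPath(root, target) {
  const queue = [{ value: root, path: ["root"] }];
  const seen = new Set([root]);
  while (queue.length > 0) {
    const { value, path } = queue.shift();
    if (value === target) return path;
    for (const key of Object.keys(value)) {
      const child = value[key];
      // Only objects can retain other objects; skip primitives and cycles.
      if (child !== null && typeof child === "object" && !seen.has(child)) {
        seen.add(child);
        queue.push({ value: child, path: path.concat(key) });
      }
    }
  }
  return null; // unreachable from this root
}

const leakedNode = { id: "detachedDiv" };
const appRoot = {
  settings: { theme: "dark" },
  eventBus: { listeners: { click: { element: leakedNode } } },
};

console.log(findRetainingPath(appRoot, leakedNode).join(" -> "));
// root -> eventBus -> listeners -> click -> element
```

Reading the path from right to left tells you which reference to break: here, the click listener that still points at the element.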

The tool has a few other ways of helping you find your leak if comparing snapshots isn't quite cutting it. There is a "containment" view and a "dominator" view. I haven't had much use for the containment view (see here for more details) but the dominator view essentially lists the objects with the biggest memory consumption which can be helpful if you've got leaking global objects.

And a late entry.... Internet Explorer 11 Heap Snapshot
A developer preview has just been released on Windows 7 and so far, so good. It's much the same as Chrome's version, if a little easier to read.

Internet Explorer 11 Developer Tools
There are two differences. Firstly, on a positive note, it has search functionality, which Chrome doesn't have. This allows you to find objects that you know the id of. On the negative side, it seems you can only compare sequential snapshots. You could not, for example, compare your first and third snapshots, which means you have to really think about when to take a snapshot.

I haven't had much time to really play around with this and it is only a developer preview, but so far it looks like it could be a very useful tool. In actual fact, the whole new set of developer tools has real potential but that's another blog post for another day.

For more info on the memory tab within the developer tools check out the MSDN documentation.

Conclusion...
As always, use the best tool for the job. For simple leaks, sIEve is very good at finding the problem. For more complex problems the heap snapshots are the way to go.

The work Google and Microsoft have done in this area recently shows how big JavaScript has now become and these tools are a great addition to any web developer's toolkit.

If you do ever have to look for a memory leak, my thoughts are with you.

Good luck!

Sunday, 26 May 2013

Web App Upgrade From .NET 3.5 to .NET 4.5

We've recently gone about upgrading our web application from .NET 3.5 to .NET 4.5 and as you could probably guess, it didn't quite go as smoothly as one would hope.

As we go through this process I'm going to blog about the difficulties and what we did to overcome them.

So, here we go...

System.Web.UI.HtmlControls.HtmlIframe


This is a whole new type in .NET 4.5 and oddly, it can cause a few problems.

Take this line of code for example:

<iframe src="about:blank" id="myFrame" runat="server" />

If you wanted to refer to this control in C# code, in 3.5 you'd write something like this (preferably in a designer.cs file):

HtmlGenericControl myFrame;

In .NET 4.5 however, an iframe is no longer an HtmlGenericControl; it's an HtmlIframe, which does not inherit from HtmlGenericControl. This means you need to change the above line of code to something that looks like:

HtmlIframe myFrame;

Creating this HtmlIframe class makes sense and means that iframes have their own object, much like the HtmlTable class, but it does seem odd that it doesn't inherit from HtmlGenericControl. Unfortunately, this design decision has knock-on effects for upgrades. Any iframe which has been defined as an HtmlGenericControl now needs to be changed to an HtmlIframe. To make matters worse, if you've manually defined these controls and they're not wired up via an auto-generated designer file, then the problem won't be picked up at compile time. You'll need to actually run the application and wait for it to fall over to find the problem.

The joys of upgrades eh?



Saturday, 23 March 2013

Stack Overflow - Much more than just answers

I'm guessing everyone who is reading this knows what Stack Overflow is but I bet most people aren't getting the most out of what really is a very useful tool for developers.

For those of you that don't know, Stack Overflow is, in its simplest form, a forum. Developers post problems or questions that they can't find the solutions for and other developers answer them. Those questions and answers are then stored so that anyone with a similar question can find the answer. After years of this, Stack Overflow has built up a pretty comprehensive archive of common problems that developers have faced and the solutions to those problems. It's one of the reasons that if you search for a development related problem on the internet then nine times out of ten, Stack Overflow is the first hit. It's a great idea and it's been well executed.

So, why am I writing a blog post about it? You already know all that. Well, up until recently that's all I knew about Stack Overflow as well. Until I actually decided to give something back.

I registered for an account and I thought I'd try and answer a few questions. I may not be the greatest programmer in the land but I do have a fair amount of experience with various technologies/frameworks so I should be able to answer the odd question or two. Turns out I was correct, I can answer the odd question. What's more, it's addictive.

Just about every interaction within the Stack Overflow community gives the rest of the community the chance to give you reputation points. Someone likes your answer? They'll vote it up. That's 10 points. Your answer gets accepted as the correct answer? That's 15 points. And the same works in reverse. If you post a load of rubbish, you'll get voted down and that's minus points. Why is this important? Well, the entire website is moderated by the community and these reputation points gauge what you can and can't do in order to help that moderation. I suppose in essence, it runs a bit like humanity does in Star Trek; in the words of Picard, "We work to better ourselves and the rest of humanity". I wouldn't class it as work but you get the idea.

As I said, answering questions becomes addictive because the more you answer, the more reputation points you get. This leads you to reading a lot of questions, a lot of which you won't be able to answer. This is good. Very good. Why? Because you learn a lot. Just by reading questions and their corresponding answers I've learnt all sorts of things, in fact I wish Stack Overflow had the ability to "notify you via e-mail when an answer is posted", there are so many good questions that get posted. I've found better ways of solving problems I solved years ago, I've read questions to problems I haven't even come across yet. It really is a great tool for learning.

In conclusion, it's a good tool for learning to communicate accurately to fellow developers. It gives you the ability to give back to the development community and it is a great tool for learning. So, if you're waiting for something to compile or you've just got a spare 5 minutes, head over there. Try and answer a few questions, read other questions and just learn.


Tuesday, 19 February 2013

IE8, Filters and IFrames

Everyone loves supporting old versions of Internet Explorer right?

Well, I came across an odd "quirk" with IE8 and it took me a little bit of time to track it down.

The problem occurs when you use a DropShadow IE filter, an Alpha filter and an iframe. When you stick them all together, the iframe becomes totally transparent. Very odd.

Let me walk you through it.

So, the set up is that we have a normal page, that page consists of a div with a DropShadow and that div contains an iframe which loads a new page. In that page, we have an overlay which has a 100% transparency set. When you strip out all the complexity, you end up with two HTML pages that look a little like the below. I've given each page a background colour just to make the problem a little more obvious.

Main.html

<html>
<head>
<title>Top Window</title>
</head>
<body style="background-color: Green;">
<center>
The top window
<br />
<br />
<div style="position: absolute; top: 55px; left: 25px; z-index: 1">Some general text in the top level window</div>
<div style="border: 1px solid black; z-index: 2; position: absolute; top: 50px; left: 20px; filter: progid:DXImageTransform.Microsoft.DropShadow(OffX=5, OffY=5, Color=#888); width: 400px; height: 400px;" >
  <iframe src="IFrame.html" style="width: 400px; height: 400px;" frameborder="0"></iframe>
</div>
</center>
</body>
</html>


IFrame.html

<html>
<head>
<title>Inner Frame</title>
</head>
<body style="background-color: blue;">
Text within the iframe

<div style="position: absolute; left: 0px; top: 0px; width:400px; height: 100%; filter: progid:DXImageTransform.Microsoft.Alpha(Opacity=0); background-color: black;">
  This is an overlay within the iframe.
</div>
</body>
</html>


All pretty straightforward so far right?

Wrong.

Here are two images of what the above produces. One is produced by IE8 and the other by IE9.

Internet Explorer 8 Internet Explorer 9

See the problem? The entire content of the iframe has become transparent. That's not what we wanted at all! IE9 on the other hand, renders it correctly.

The solution? Remove the DropShadow on the div. OK, so it doesn't look as good but at least it gives a consistent look across IE browsers. You can always reproduce the box shadow effect using a different method. Perhaps a second div with a grey background colour, placed underneath the div containing the iframe but with a bit of an offset, would have the same effect, although I haven't actually tried it.

Oh the joys of old versions of Internet Explorer.

Tuesday, 8 January 2013

HttpHandlers and Session State

By default, if you create a new HttpHandler, it does not have access to the session object. Take the following as a very simple example:


public class MyHandler : IHttpHandler
{
    #region IHttpHandler Members

    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
        string s = (string)context.Session["MySessionObject"];
        context.Response.Write(s);
    }

    #endregion
}


Do you see the problem? context.Session will be null, so the first line of ProcessRequest will throw a NullReferenceException.

So, how do you access the Session object from within an HttpHandler? I've tried all sorts of magical workarounds, some worked, some didn't but by far the easiest is just to simply add the IReadOnlySessionState interface to your handler, so it'll look like this:


// IReadOnlySessionState lives in the System.Web.SessionState namespace
public class MyHandler : IHttpHandler, IReadOnlySessionState
{
    #region IHttpHandler Members

    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
        string s = (string)context.Session["MySessionObject"];
        context.Response.Write(s);
    }

    #endregion
}


And as if by magic, your session object is populated and you can read your session objects like you usually would. Fantastic news! You can't write to the session object this way, by the way, but I've not come across a scenario where I've needed to yet. If you do need write access, implement IRequiresSessionState instead.

Thanks to Scott Hanselman's blog for the solution to that little problem!

Saturday, 5 January 2013

League Predictor

For those of you that don't know, I play a fair amount of football and I run the website for the Sunday league team for which I play, Sumners Athletic. For many years I've used that site, and the server it's hosted on, to test new technologies and new methodologies and to improve my understanding of other web technologies.

Every now and again, when I'm playing around I create a control that I can share with the world. It's usually not the polished article but it does a job. Five or six years ago I created one of these controls. I created a "league predictor" in the form of a Java applet (who remembers those?).

What is a league predictor? Well, for a football team like mine that plays in a league where each team plays every other team twice (home and away), the predictor takes all the results of the league to date and works out the remaining fixtures. Those fixtures are then presented to the user so that they can make predictions about those games. At the end, the user hits a button and the predictor re-draws the final league standings based on the predictions that the user has made.

Well, unless you've been out of the web application loop for the past 5 years, you'll probably know that Java applets are all but dead. JavaScript and HTML5 are the way forward so, I thought I'd re-write that original control using those technologies. The re-write is now done so I thought I'd make the code available to all. At some point I'd like to add animations to the league re-draw but that's another post for another day.

Anyway, the JavaScript file and corresponding CSS and HTML files can be found in this zip file.

If you want to see a working example, check this out: Sumners Athletic League Predictor
(Please ignore the fact that my team, Sumners Athletic, are currently bottom of the league!)


Just a few notes about the "control"...

Firstly, the JavaScript requires some initial data to be able to work out the league and what fixtures need to be played. To do this, it needs what I call a results matrix, in the form of a CSV file.

What's a results matrix? Well, it's essentially a grid of all the games played. You can imagine it as a grid, with the team names forming the first row and the first column. The results of each game (assuming you play home and away once) can then fit in the corresponding cells. If you open the CSV file included in the above zip file within a spreadsheet, you'll see what I mean.

This initial data is then requested by JavaScript and processed. It converts the CSV file into a 2D array and constructs team objects from that array. A team object consists of the name of the team and the number of games played, won, drawn and lost. From this, a league can be constructed.
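
For illustration, here's a rough sketch of that step. It isn't the code from the zip file. The CSV layout (scores written as "homeGoals-awayGoals", empty cells for unplayed games, the row team playing at home), the sample results and the function names parseMatrix and buildTeams are all my assumptions for the example.

```javascript
// Assumed CSV layout (not taken from the zip file): first row and first
// column hold team names, each cell holds "homeGoals-awayGoals" for the
// row team at home against the column team, empty if not yet played.
const csv = [
  ",Sumners,Rovers,United",
  "Sumners,,2-1,",
  "Rovers,0-0,,3-1",
  "United,,1-2,",
].join("\n");

// Convert the CSV text into a 2D array of strings.
function parseMatrix(text) {
  return text
    .trim()
    .split(/\r?\n/)
    .map((line) => line.split(",").map((cell) => cell.trim()));
}

// Build one team object per team from the matrix.
function buildTeams(matrix) {
  const names = matrix[0].slice(1);
  const teams = new Map(
    names.map((name) => [name, { name, played: 0, won: 0, drawn: 0, lost: 0 }])
  );
  for (let r = 1; r < matrix.length; r++) {
    for (let c = 1; c < matrix[r].length; c++) {
      const cell = matrix[r][c];
      if (r === c || cell === "") continue; // diagonal or unplayed
      const [homeGoals, awayGoals] = cell.split("-").map(Number);
      const home = teams.get(matrix[r][0]);
      const away = teams.get(names[c - 1]);
      home.played++;
      away.played++;
      if (homeGoals > awayGoals) { home.won++; away.lost++; }
      else if (homeGoals < awayGoals) { home.lost++; away.won++; }
      else { home.drawn++; away.drawn++; }
    }
  }
  return [...teams.values()];
}

const teams = buildTeams(parseMatrix(csv));
console.log(teams);
```

Sorting that array by whatever points system your league uses then gives you the table.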

Then, from the results matrix, we can work out what fixtures are remaining. We can then display these fixtures to the user. Once they hit the "Predict" button, the predictions made are then fed back into the results matrix and the league is re-drawn based on the new results matrix.
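
Again as a sketch rather than the actual code: under the assumption that an empty cell in the matrix means the game hasn't been played, finding the remaining fixtures and feeding predictions back in might look like this. The sample matrix and the names remainingFixtures and applyPredictions are hypothetical.

```javascript
// Assumed layout of a results matrix already parsed into a 2D array: the
// row team is at home, the column team is away, "" means not yet played.
const matrix = [
  ["",        "Sumners", "Rovers", "United"],
  ["Sumners", "",        "2-1",    ""],
  ["Rovers",  "0-0",     "",       "3-1"],
  ["United",  "",        "1-2",    ""],
];

// Every empty off-diagonal cell is a fixture still to be played.
function remainingFixtures(m) {
  const fixtures = [];
  for (let r = 1; r < m.length; r++) {
    for (let c = 1; c < m[r].length; c++) {
      if (r !== c && m[r][c] === "") {
        fixtures.push({ home: m[r][0], away: m[0][c], row: r, col: c });
      }
    }
  }
  return fixtures;
}

// Feed predictions back into a copy of the matrix, leaving the input intact.
function applyPredictions(m, predictions) {
  const copy = m.map((row) => row.slice());
  for (const p of predictions) copy[p.row][p.col] = p.score;
  return copy;
}

const fixtures = remainingFixtures(matrix);
console.log(fixtures.map((f) => `${f.home} v ${f.away}`));
// [ 'Sumners v United', 'United v Sumners' ]

const predicted = applyPredictions(
  matrix,
  fixtures.map((f) => ({ row: f.row, col: f.col, score: "1-1" }))
);
console.log(remainingFixtures(predicted).length); // 0
```

Working on a copy means the real results are never overwritten, so the user can change their mind and predict again.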

And that's it.

Nothing too complex but I thought I'd share the code. I remember when I first started building my first football team website, I looked around for something that would do just this and couldn't find anything, now there's an option out there!



Sunday, 9 December 2012

Web Apps and Pagination Queries

Paging controls are found on many popular web apps.
Amazon is a great example.
In a fair few web applications out there, you'll see the concept of pagination. That is, you have a list of results separated into pages which each show a limited number of results (say 25 records per page). You can then usually go forward a page, go back a page or jump to the first or last page. This is a common UI practice and I'm sure you've seen it before.

But, how is this implemented and what is the best way of doing so?

Well, the results shown on the page are usually held against a database. So the question quickly becomes:

"How can I query the database to bring back a given page of results?"

Now before I carry on, let's first state that this blog post consists of database queries that have been run against an Oracle database (both the 10g and 11g versions). There's a strong possibility that they will not run against SQL Server, or any other database for that matter, but I suspect the general concepts remain the same. I should also state that I am by no means an Oracle expert. Yes, I know SQL reasonably well, but an expert? Far from it. So, if you find a better way of implementing the following features then it'd be great to hear from you.

Now, back to the task in hand. We need to write a SQL command to run on an Oracle database that will bring back a "page" of data of a particular table and as an extra condition, the data returned must be in a given order. Let's also say that a page consists of 25 records.

You'll be surprised at how many ways there are to achieve this. In the 4.5 years that Qube have been using this functionality, our pagination queries have changed no fewer than four times. Sometimes we've made the odd small change to optimize the query; at other times we've re-written the whole thing and used a different technique. In each case, we've improved performance over the last iteration of changes. This is done with the goal of making our web application as fast as possible. Finally, we've now got to the stage where one technique is faster than another on one table but slower on a different table. This means there is no technique that's best for all tables. It'll depend entirely on the amount of data you have in each table, the indexes you've set up and the way in which your database is optimized.

I'm now going to take you through our little journey of finding the best possible pagination query. In all of the example queries, the base query (the query which we want to paginate) is the innermost SELECT against TABLE_NAME. Generally, the following methods wrap that base query up in order to implement paging. You'll need to change the base query to be the actual query you want to use pagination on.

Revision One - Retrieve All

So, our first option was to retrieve all records and let the application handle the pagination. So, if we have a table that consists of 100,000 records, we retrieve all records and then the application will display 25 at a time. The bonus of this method is it's extremely easy to implement. Also, moving from one page to the next is extremely quick, all the records are held in memory so navigating between them is simple and fast. There are however, some major downsides. For starters, retrieving all records is slow so the initial load time for the user is severely affected. If that's not bad enough (and it should be), the memory usage of such a technique is outrageously large, especially when you consider the fact that it's highly unlikely a user will want to navigate through the entire 100,000 records anyway, so you're using valuable memory on records that are never going to be shown. Not cool! Just with these two things in mind, this technique is going to scale very badly.

Example Query
SELECT COLUMN_ONE, COLUMN_TWO, COLUMN_THREE 
FROM TABLE_NAME 
ORDER BY PRIMARY_KEY
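
To show what "let the application handle the pagination" means in practice, here's a small sketch. It assumes the rows have already been loaded into an array; getPage, pageCount and the generated allRows data are hypothetical and only stand in for whatever your application actually holds in memory.

```javascript
// Hypothetical helper: all rows are already in memory, so a page is a slice.
function getPage(rows, pageNumber, pageSize = 25) {
  const start = (pageNumber - 1) * pageSize; // pages are 1-based
  return rows.slice(start, start + pageSize);
}

// Number of pages needed to show every row.
function pageCount(rows, pageSize = 25) {
  return Math.ceil(rows.length / pageSize);
}

// Stand-in for the 100,000 records returned by the query above.
const allRows = Array.from({ length: 100000 }, (_, i) => ({ primaryKey: i + 1 }));

console.log(getPage(allRows, 1).length);        // 25
console.log(getPage(allRows, 2)[0].primaryKey); // 26
console.log(pageCount(allRows));                // 4000
```

Moving between pages is trivially cheap, but all 100,000 objects stay in memory for every user whether or not they ever look past page one.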

Revision Two - Nested Queries

Bringing back everything clearly isn't the way forward. So, with a little bit of use of ROWNUM (an Oracle pseudocolumn which returns the number of the row in a result set, e.g. the first row has a number 1, the second has a number 2, etc.) we can bring back 25 rows at a time, which will give us the paging functionality that we need.

Using this method the application doesn't need to store all the possible rows, which lowers the memory consumption of the application significantly. It also improves the performance of the first page load. Instead of bringing back thousands of rows, the database only returns 25 rows, with obvious performance benefits. The downside however is that the performance of the next/previous/first/last page buttons will be affected, as the query will need to be re-run for the next page of data. If all the rows were stored on the application from the original query then this additional database query wouldn't need to be run.

Example Query
SELECT * FROM (SELECT rownum as f2n_rownum, f2n_table.* 
               FROM (SELECT COLUMN_ONE,
                            COLUMN_TWO,
                            COLUMN_THREE 
                     FROM TABLE_NAME 
                     ORDER BY PRIMARY_KEY) f2n_table 
               WHERE rownum <= 25) 
WHERE f2n_rownum >= 1
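
The query above hard-codes the bounds for the first page. For any other page the two numbers change, and the arithmetic is easy to get off by one, so here's a sketch of it. The helper names (pageBounds, wrapWithRownum) and the bind variable names :first_row and :last_row are my own for this example, and how you actually bind the values depends on your database driver.

```javascript
// Row bounds for a 1-based page number: page 1 is rows 1-25, page 2 is 26-50.
function pageBounds(pageNumber, pageSize = 25) {
  return {
    firstRow: (pageNumber - 1) * pageSize + 1,
    lastRow: pageNumber * pageSize,
  };
}

// Wrap a base query in the nested ROWNUM pattern shown above. The bounds
// are left as bind variables rather than concatenated into the SQL text.
function wrapWithRownum(baseQuery) {
  return [
    "SELECT * FROM (SELECT rownum AS f2n_rownum, f2n_table.*",
    `               FROM (${baseQuery}) f2n_table`,
    "               WHERE rownum <= :last_row)",
    "WHERE f2n_rownum >= :first_row",
  ].join("\n");
}

const baseQuery =
  "SELECT COLUMN_ONE, COLUMN_TWO, COLUMN_THREE FROM TABLE_NAME ORDER BY PRIMARY_KEY";

console.log(wrapWithRownum(baseQuery));
console.log(pageBounds(1)); // { firstRow: 1, lastRow: 25 }
console.log(pageBounds(3)); // { firstRow: 51, lastRow: 75 }
```

Keeping the SQL text identical from page to page and only changing the bound values also gives the database a chance to reuse the parsed statement.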

Revision Three - Nested Queries Using WITH

As always, speed is key!
We've now got an implementation of our pagination query but is it the best implementation? Did you know that your average user expects your web application to load in two seconds or less, and that up to 40% of your users will leave your site if it hasn't responded after three seconds? [1] That means these queries need to be as quick as possible; every second counts. When you're dealing with possibly thousands of records, it can be difficult to bring back results in that time frame.

So, we looked to see if we could improve the performance of our pagination query. If we can then that's a performance improvement across nearly all of our pages. As it turns out... there is a better way! Kind of.

We could improve the performance in two different ways. Firstly, we can use the WITH clause. This is known as subquery factoring [2]. The second improvement is that we can tell the Oracle optimizer how many rows we intend to use, which allows the optimizer to choose a faster execution plan for what we want. We do this by using an optimizer hint in the query, and in this hint we tell the optimizer that we want the first 25 rows brought back first (or however many rows are contained in your "page" of data). For more information on this try here: www.orafaq.com.

Example Query
SELECT /*+ FIRST_ROWS(25) */
       pageouter.* 
FROM (WITH page_query AS (SELECT COLUMN_ONE,
                                 COLUMN_TWO,
                                 COLUMN_THREE
                          FROM TABLE_NAME
                          ORDER BY PRIMARY_KEY)
      SELECT page_query.*, 
             ROWNUM AS innerrownum 
      FROM page_query 
      WHERE rownum <= 25) pageouter 
WHERE pageouter.innerrownum >= 1

Revision Four - Nested Queries With ROW_NUMBER

Ok, so we're now using optimizer hints and we're using subquery factoring. All good stuff. But can we do more?

Well, we can. Kind of.

There is a SQL function called ROW_NUMBER(). It serves very much the same purpose as ROWNUM in Oracle but it works in both Oracle and SQL Server and oddly, when used can perform better than our previous queries but only in certain scenarios.

The problem here is I can't tell you why it performs better in certain scenarios, I can't even tell you in which scenarios it performs better but here is what I have found:

  • It seems to perform better than our previous methods if the query is modified to have a complex 'where' clause.
  • It seems to perform better than our previous methods if the data is ordered by a column that is not uniquely indexed.
  • The performance gains can be dramatic. In the previous examples, changing from one method to another may have seen an improvement ranging from nothing to a second or two. I've seen this method improve some queries by up to 5-8 seconds, especially on queries that order data by columns that aren't indexed.

Now I suspect all of this is very much dependent on the indexes you have set up on your tables, the amount of data in your tables, how you've got your database optimized and probably a fair few other factors that I have no idea about. So, the best way to know how this will perform for your queries is to test it.

Example Query
SELECT /*+ FIRST_ROWS(25) */
       *
FROM ( SELECT COLUMN_ONE,
              COLUMN_TWO,
              COLUMN_THREE,
              row_number() OVER(ORDER BY PRIMARY_KEY) innerrownum
       FROM   TABLE_NAME
     )
WHERE innerrownum BETWEEN 1 AND 25

Conclusion

I've shown you three different ways of implementing pagination within the database query. There are other ways which I haven't discussed. For example, you could follow this process:
  1. Run the query for ALL records (no paging) but insert the results of that query into a temporary table.
  2. Query that temporary table for the "page" of data that you want, using one of the methods above.
  3. When implementing the next/previous page function, you can then query the temporary table directly. 
Assuming that your original query isn't bringing back the entire table, you'll be selecting from a subset of the original data which should make the next/previous functionality faster. However, your original page load time will be slower as you'll need to insert the records into the temp table so, there's a trade off. 

I would imagine there's loads of other ways of doing this, if you find any that perform better than the above then let me know, it'd be great to hear from you!

And finally.... SQL Server

I couldn't end without mentioning the latest version of SQL Server and the good work Microsoft have been doing in this area. Microsoft have cottoned on to the fact that this paging functionality is now widely used and, as you can tell by this article, it isn't straightforward. They've gone out of their way to simplify this and built this functionality straight into the language, making it very simple and, I would hope, a whole lot quicker than anything we can write in standard SQL.

I can't say I've had the pleasure of testing this but, according to the documentation, the feature is implemented by the introduction of two new keywords, OFFSET and FETCH NEXT and they're used in the following way:

SELECT COLUMN_ONE,
       COLUMN_TWO,
       COLUMN_THREE
FROM   TABLE_NAME
ORDER BY PRIMARY_KEY
OFFSET 0 ROWS
FETCH NEXT 25 ROWS ONLY

This tells the database to bring back the first 25 rows. To bring back the next page, you'd increase the offset by your page size (in our example, 25). For more info, check out raresql.com.
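
As a sketch of that arithmetic, the offset for a 1-based page number is simply the page size multiplied by the number of pages before it. The function name offsetFetchClause is hypothetical, and like the query itself this is untested against a real SQL Server. In real code you'd pass the two numbers as query parameters rather than building the text, which is why the sketch insists on integers.

```javascript
// Build the OFFSET / FETCH NEXT clause for a 1-based page number.
// Only positive integers are accepted because the values end up in SQL text.
function offsetFetchClause(pageNumber, pageSize = 25) {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new RangeError("pageNumber must be a positive integer");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  const offset = (pageNumber - 1) * pageSize;
  return `OFFSET ${offset} ROWS\nFETCH NEXT ${pageSize} ROWS ONLY`;
}

console.log(offsetFetchClause(1));
// OFFSET 0 ROWS
// FETCH NEXT 25 ROWS ONLY

console.log(offsetFetchClause(3));
// OFFSET 50 ROWS
// FETCH NEXT 25 ROWS ONLY
```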

And it's that simple.

The sooner Oracle implement this functionality the better!


1 - Forrester Consulting, “eCommerce Web Site Performance Today: An Updated Look At Consumer Reaction To A Poor Online Shopping Experience” A commissioned study conducted on behalf of Akamai Technologies, Inc., August 17, 2009
2 - For more information on subquery factoring, see www.dba-oracle.com