Monday, 9 May 2016

ODP.NET Connection Pooling - The Black Box

I hate not being able to step into code and work out what's going on. Unfortunately, with ODP.NET connection pooling, that's exactly the situation we're all in. To make it worse, the documentation is rubbish. No surprise there.

When errors inevitably start occurring, finding the cause can be a tad difficult. I thought I'd share my latest debug battle in the hope that it may help others.

As you probably know, connection pooling is a technique for improving the performance of taking out connections to a database. If it's enabled, opening a connection doesn't open one directly against the database; instead, you grab an already-open connection from a pool of connections. If there isn't a spare connection, the pool requests one from the database on your behalf and hands it to you. When you're done with it and you close the connection, the underlying database connection isn't actually closed; it's returned to the pool for later use.
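
As a very rough sketch of what that contract looks like in code (the connection string is made up; pooling is on by default in ODP.NET and adding Pooling=false would turn it off):

using(OracleConnection conn = new OracleConnection("Data Source=MyDatabase; User Id=username; Password=password;")){
    conn.Open(); // Grabs an already-open connection from the pool (or has the pool create one).

    // ... run your commands ...

} // Close/Dispose hands the connection back to the pool; the database connection stays open.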

This is a well-known practice with some very misunderstood configuration options, which lead to unexpected behaviour. To make matters worse, that behaviour is difficult to track because it happens inside the "black box" that is the connection pool. All you'll know is that your system throws an exception seemingly at random, and you'll spend hours, even days, trying to find the cause.

I want to cover the two configuration options that have caused me massive headaches: Connection Timeout and Connection Lifetime.

The options are set in the connection string, e.g.

Data Source=MyDatabase; User Id=username; Password=password; Connection Lifetime=15; Connection Timeout=30; Max Pool Size=100; Incr Pool Size=5; Decr Pool Size=3; Validate Connection=false;

Connection Timeout

This timeout value is the amount of time, in seconds, that the system should wait for a connection to be grabbed from the connection pool. How long this actually takes depends on the state of the pool.
There are three possible paths:
  1. The pool already has a spare connection so it just returns it. This is by far the fastest option.
  2. The pool doesn't have a spare connection so it has to request one from the database. For performance reasons, the pool won't request just one connection; it'll request an entire batch. The number of connections requested in that batch is defined by the Incr Pool Size attribute of the connection string. This is obviously slower, and if you've got network problems or the database server is busy doing something else, the timeout value may be hit. This causes a "Connection Request Timed Out" exception to be thrown.
  3. The pool doesn't have a spare connection and it already holds the maximum number of connections it's allowed (as defined by the Max Pool Size attribute). In this case, the pool waits until a connection is freed up (more about that later) so it can either return that or request a new connection without breaking the maximum pool size limit. If the wait exceeds the timeout value, it too will produce the "Connection Request Timed Out" exception. There's a sketch of handling this exception just below.
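
As a rough sketch of what hitting that timeout looks like from the calling code (this reuses the made-up connection string from above; matching on the exception message is an assumption on my part, so check what your ODP.NET version actually reports for a pool timeout):

string connStr = "Data Source=MyDatabase; User Id=username; Password=password; Connection Timeout=30; Max Pool Size=100; Incr Pool Size=5;";

try {
    using(OracleConnection conn = new OracleConnection(connStr)){
        conn.Open(); // Waits up to Connection Timeout seconds for the pool to produce a connection.
        // ... use the connection ...
    }
}
catch(OracleException ex) when (ex.Message.Contains("Connection request timed out")){
    // Either path 2 timed out requesting a new batch from the database, or path 3
    // waited too long for a connection to be freed. Retrying immediately usually just
    // queues up behind the same bottleneck, so log it and review the pool settings.
    Console.WriteLine("Pool could not produce a connection in time: " + ex.Message);
}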

Connection Lifetime

This value is how long the connection will live for, in seconds, and here we're talking about the actual connection to the database. Think about that for a moment: the purpose of the pool is to reduce the number of connections we open against the database. Opening a new connection is expensive and time consuming, so the less time we spend doing it the better. Yet here we have an option that kills off connections after a given period of time. If you set this value to a low number, you severely impact the effectiveness of the pool: you open the connection, the lifetime value kicks in, the connection to the database is closed, and nothing has actually been pooled. Whatever you set it to, you'll be limiting the effectiveness of the pool; the lower the value, the more you lose and the worse the performance.
So why does it exist? It's all to do with clustering; there's a very good post regarding it found here. The summary: do not use this option unless you're connected to a cluster.

Which is where my problems began. It turns out an application I've been using had this value set to 15 seconds. The application doesn't use clustering, and each user has their own connection string and therefore their own connection pool. Running some rough figures: for 5 users using the system continuously for an hour, instead of opening 5 connections and reusing them (ignoring the Incr Pool Size attribute here), it opens 1,200, because a 15-second lifetime means 240 connections per user per hour, times 5 users. Yep. That's right. 1,200. Instead of 5. That's 1,195 more times the application has to sit and wait for a connection to be established, severely affecting performance. That's 1,195 more requests creating unnecessary network traffic. And it's 1,195 more times the database has to open a new connection when, quite frankly, it's got better things to be doing (like returning data to your users!). And that's just 5 users for 1 hour. Scale it up to 100 or 200 users over several hours and you can see how the situation gets out of hand pretty quickly. In our case we were seeing random connections fail with the "Connection request timed out" error for seemingly no reason at all... and when you think about it, these are errors on an action that shouldn't even have been happening.

The obvious thing to do, then, was to remove the attribute, letting all connections last forever. This works really well. The connections are taken out when the application starts up and then that's it; the same connections get used over and over again. If a connection isn't active, perhaps because that user logs off, then the connection pool service, which runs every 3 minutes, will close the connection to the database for you. Brilliant.
Until... a connection becomes invalid. At that point you have a connection in the pool that will error as soon as you try to use it. This can happen in a variety of circumstances: perhaps there's a network error, or a firewall has killed off your connection at the TCP level, or the database crashed. Whatever the reason, your connection pool now holds a load of connections that can't be used and, unless the user stays logged off for 3 minutes so the pool service can clear them down, they won't ever be removed. Your application won't recover.

What now? The Validate Connection property to the rescue! In the example connection string above, you'll see it's set to false, which is the recommended default. If set to true, every time you grab a connection from the connection pool, the pool will first ensure the connection is valid. To do that, it has to go back to the database server and run a query (it runs SELECT 'x' FROM DUAL); if that comes back okay, it assumes the connection is valid and returns it to you for use. If not, it ditches the connection and opens up a new, valid one.
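
Turning that on is just a connection string change. A sketch, reusing the made-up credentials from the example above (Connection Lifetime=0 disables the lifetime check, which is the default):

Data Source=MyDatabase; User Id=username; Password=password; Connection Lifetime=0; Validate Connection=true; Max Pool Size=100;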
Great, we have a full solution then: leave Connection Lifetime unset (or set it to a very large number) so connections never die, and set Validate Connection to true so that if any connection does become invalid, the pool fixes itself by detecting the dead connection, getting rid of it and opening a new, valid one. However, let's think about performance for a minute... every time we request a connection from the connection pool, we go off to the database to ensure the connection is still valid. If you're requesting a lot of connections, that's a large overhead: the round trip to the database server is expensive, and running a query on top of it is more expensive still. Is there a better solution?

I've only thought of one, and it's not suitable for everyone. If you use the connection as you normally would, without validating it, you can detect the failure when you run the next database query. On detection, you clear the connection pool, take out a new connection and try again. I made a central method to handle this, which went a bit like this.

// Needs using System, System.Linq and Oracle.DataAccess.Client.
private static readonly int[] CONNECTION_INVALID_CODE = new int[]{ 28, 3113 };

public T ConnectionValidate<T>(Func<T> databaseMethod, OracleConnection conn){
    int retryCount = 0;
    while(true){
        try {
            // Run the caller's database work; if the connection is dead, this throws.
            return databaseMethod();
        }
        catch(OracleException ex){
            if(CONNECTION_INVALID_CODE.Contains(ex.Number)){
                if(retryCount == 5)
                    throw;

                // Ditch the dead connection, flush its pool and go round again.
                conn.Close();
                OracleConnection.ClearPool(conn);
                retryCount++;
            }
            else {
                throw;
            }
        }
    }
}

Unfortunately it's not quite as simple as that; you need to take into account whether you were in the middle of a transaction (in which case you probably just want to re-throw the exception), but let's run with it for the time being...

The method does a few things. First, we define the Oracle error codes that relate to the connection being invalid: ORA-00028 is thrown when your session has been killed, and ORA-03113 when the connection has ended prematurely. I'm sure there are others, but that's a start.
We then define a method that takes, as a parameter, a function returning a generic. Usually that's something like the number of rows affected by a statement or the number of rows returned by a select; using a generic just makes the method more... well, generic. We could, for example, populate a DataTable and return that instead. We try to run that function, which contains the Oracle statement, and if it succeeds then great. If it fails, we check why. If it failed because of one of the error codes mentioned above, we close the connection and clear the connection pool, which removes all the inactive connections, then try the whole thing again. The next time around, the connection grabbed will hopefully be a brand new one from the database, rectifying the problem. If it keeps failing, we eventually re-throw the exception and assume there's a bigger problem that we need to deal with.

You can call that method like so:

using (OracleConnection connection = new OracleConnection())
{
    ConnectionValidate(() => {
        connection.Open();

        using(OracleCommand command = connection.CreateCommand()){
            command.CommandText = "SELECT * FROM myTableName";

            using(OracleDataReader reader = command.ExecuteReader()){
                int rowsRead = 0;
                while (reader.Read())
                {
                    string myField = (string)reader["MYFIELD"];
                    Console.WriteLine(myField);
                    rowsRead++;
                }

                // Func<T> needs a return value; the number of rows read will do.
                return rowsRead;
            }
        }
    }, connection);
}

Note that you have to open the connection as part of the inner function for this to work (you could move things around so the connection is opened as part of the ConnectionValidate method, but I haven't).
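
For what it's worth, here's a rough sketch of that variant: ConnectionValidate opens the connection itself before each attempt, so a retry after ClearPool automatically grabs a fresh connection (ConnectionState lives in System.Data).

public T ConnectionValidate<T>(Func<T> databaseMethod, OracleConnection conn){
    int retryCount = 0;
    while(true){
        try {
            // Opening here means a retry after ClearPool gets a brand new connection.
            if(conn.State != ConnectionState.Open)
                conn.Open();

            return databaseMethod();
        }
        catch(OracleException ex){
            if(!CONNECTION_INVALID_CODE.Contains(ex.Number) || retryCount == 5)
                throw;

            conn.Close();
            OracleConnection.ClearPool(conn);
            retryCount++;
        }
    }
}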

Now we have a solution where we only validate the connection when it's absolutely necessary (i.e. when it has failed in use). We only open new connections to the database when we absolutely need to, reducing network traffic and load on the database and, more importantly, reducing the chance of failure (due to network problems or otherwise) when opening new connections.

As I said, my solution isn't very innovative, but it seems to work. This is still very much code in testing, so I may yet find issues with it.

How do other people handle this scenario? It seems like a fundamental issue, with a lot of confusion around what the configuration options do and how to handle the various failure scenarios. Perhaps I'm missing something? I'd love to know your thoughts.

Friday, 20 November 2015

ASP.NET 5 - Embedding Resources

ASP.NET 5 has now been released as a final release candidate, so it's only right to start playing with it (if you haven't already, go check out www.asp.net/vNext).

The structure of projects has changed greatly in the new version, and the documentation so far is a bit lacking, so figuring out even the most straightforward processes takes time.

So to make things easier, I'm going to blog about some of the more... time consuming changes. Most of them are pretty straightforward once you know how it all works, but it's getting to that stage that I want to help with.

First up... how to embed resources into a DLL.

Embedding Resources

Pre version 5, you could go to the Properties of a file and set its Build Action to "Embedded Resource".

This would then modify the .csproj file, specifying that the file in question is embedded into the DLL.

ASP.NET v5 doesn't have a .csproj file. Instead, it uses the project.json file so we need to modify this accordingly.

There's a "resource" property on the json config object. This takes an array of strings which represents a glob pattern for matching files.

As an example, for a file structure looking like this...

- Dependencies
- Controllers
- Views
- Resources
       - Files
             - Embedded.txt

You'd have a project.json file that looks like this...


...

"frameworks": {
    "dnx451": { },
    "dnxcore50": { }
},

...

"resource": [ "Resources/Files/*.txt" ]

...


That would embed all .txt files found under the Resources/Files directory. Build the project and hey presto. You've embedded your file into the DLL.

Once you've embedded the file, you probably want to read it. And that's changed too!

Reading Embedded Resources

Pre version 5, a familiar structure for reading an embedded resource would be:


var assembly = Assembly.GetExecutingAssembly();
var resourceName = "EmbedResource.Resources.Files.Embedded.txt";

using(Stream stream = assembly.GetManifestResourceStream(resourceName)){
    using(StreamReader reader = new StreamReader(stream)){
        string result = reader.ReadToEnd();
    }
}


However, in version 5, Assembly does not have a GetExecutingAssembly() method, nor does the Type class have an Assembly property, so getting hold of the required assembly object proves rather difficult. But with the help of the GetTypeInfo extension method on the Type object, version 5 has a solution.

Below is code to achieve the same thing as above but in version 5...


var assembly = this.GetType().GetTypeInfo().Assembly;
var resourceName = "EmbedResource.Resources.Files.Embedded.txt";

using (Stream stream = assembly.GetManifestResourceStream(resourceName)){
    using (StreamReader reader = new StreamReader(stream)){
        string result = reader.ReadToEnd();
    }
}
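
One gotcha: GetManifestResourceStream returns null if the resource name is wrong (the name is typically the assembly's default namespace plus the file path with dots, as in the example above). If you're unsure what name to use, you can dump everything that actually got embedded:

var assembly = this.GetType().GetTypeInfo().Assembly;

// Prints every embedded resource name, e.g. "EmbedResource.Resources.Files.Embedded.txt".
foreach(var name in assembly.GetManifestResourceNames()){
    Console.WriteLine(name);
}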


And that's that. As I said... nothing difficult. Just a case of knowing the differences between the frameworks.

Tuesday, 17 November 2015

Microsoft Future Decoded 2015

It was about this time last year that I wrote a rather negative blog post regarding the Microsoft Future Decoded event. The event was meant to show up-and-coming technologies and to help inspire developers to go off and create the next generation of awesome software.

I felt it failed to do that.

This year however, I'm glad to report, Microsoft got it right.

The day, like last year, kicked off with the keynote speakers. The keynotes were all presented by very good public speakers, but even more important was the content being shown. All of it was relevant. All of it left me wanting to go away and research a topic, or to try and come up with new and inventive ways of using a piece of new technology. Most importantly, it got me thinking, which is the whole purpose of the event.

There was a big emphasis on the Microsoft cloud which, in fairness, is massively impressive. There was even more information about the Azure Machine Learning service, and that's not surprising: the ability to recognize patterns or anomalies within a huge data set is a very powerful tool, and one I didn't realise I had at my fingertips. Learning about the theory behind it, as well as the practical side of things, was vastly interesting and may just prove to be the next area of computer science that I devote my time to.

The technical sessions followed the keynotes and they too were interesting and thought provoking. An area I'm interested in at the moment is the imminent release of .NET 5, and the technical tracks demoed various new features included in that release. As an example, a simple performance test was run on the new Kestrel web server (the one that replaces IIS in .NET Core), and the sheer difference between the two was staggering.

I expect these events to be thought provoking. I expect these events to inspire and I expect these events, at times, to make me go "Wow!".

This event did all of that.

Bring on next year!

Sunday, 21 June 2015

Book Review - The Pragmatic Programmer

Well. It's been a while since my last blog but this book was too good not to shout to the world about. If you're a software developer and you're looking for ways to become better then it's a must read. It's that simple.

The Pragmatic Programmer is essentially a collection of tips which, when followed, will help you to hone your skills. It's suitable for all experience levels from the novice writing code for fun to the senior developer who has been writing code for the past 30 years. It's got something for everyone.

Interestingly, you'll likely have come across a lot of the tips discussed in the book at university or in your day-to-day job. However, the authors have expanded on those basic concepts and applied them across the board, to every facet of your day job, with some pretty awesome consequences for efficiency, productivity and code maintainability.

You'll find yourself thinking about previous projects and wishing you had applied some of the tips discussed. You'll silently vow to apply some of the concepts the next time you sit down at the keyboard and start writing code.

My only negative about the book is that there are a lot of tips, 70 in total. As developers, our primary aim is to ship code, and it's our job to do that in a timely manner, balancing time, quality and maintainability. If you attempt to do everything the book recommends, you'll soon find yourself missing deadlines... a lot of them. (Although you would have a very well designed, well documented, bug-less system... it'd just be ready about 10 years too late.)

Implementing each and every idea on every project is a practical impossibility, but if you're able to apply the odd tip here and there, reducing the effort of documentation, say, or designing a module to better promote maintainability, then the book will have accomplished its goal: to help you reach the next level.

Enjoy!

Saturday, 22 November 2014

Microsoft Future Decoded 2014

On a pretty miserable Wednesday, I set off for the ExCeL centre in London to attend the Microsoft Future Decoded conference. My aim for the day was to learn what Microsoft was planning for the future of the .NET framework and, hopefully, to generate a few ideas that I could utilize in future projects.

The event consisted of three separate sections: in the morning there were the keynote speeches, in the afternoon there were technical tracks on a range of subjects, and throughout the day there was the expo.

The keynote speeches were, as a whole, very interesting. They covered quite a wide range of topics, from open data (that's data that is publicly available in a machine readable format) to the technology advancements in the Formula One world. The speakers were all very good, and the final one was Professor Brian Cox, who is obviously on TV a lot, which should give you an idea of the calibre of speaker at the event. The problem with all of this was that very little of the content actually related to the development world, and I was left wondering what exactly I, as a developer, was meant to take from the talks. Don't get me wrong, it was very interesting learning about the theories behind how the universe began, but I didn't see how that was going to help me in the near future.

As for the technical tracks, there was a variety of topics to go and listen to, ranging from "Big Data" to the "Future of the .NET Framework". Each track was made up of three separate presentations, each one separated by a quick coffee break. Now, I have an interest in various areas, and it would have been good if I could have sat in on one talk from one track, then another talk in another track... mixing and matching topics throughout the day. The problem was that the talks in each track lasted different amounts of time, so there was no way to do that without turning up to a talk late.

With that in mind, I ended up going to the "Future of .NET Framework" track. It seemed a reasonable choice considering it will affect pretty much every project I work on in the foreseeable future, and it seemed the vast majority of attendees thought the same: the session was massively oversubscribed, with a good portion of people having to sit on the floor or stand at the back. To make matters worse, I didn't get much out of it, if I'm honest. I follow the inner workings of the .NET framework in quite a bit of detail, making sure I'm at the cutting edge of the technology and looking out for anything on the horizon I can take advantage of. Sadly, the vast majority of what was shown I had already read about and researched from various other announcements. The one big announcement of the day was that the .NET framework was being made completely open source. Don't get me wrong, this is a massive step and one in the right direction, but I didn't gain any value from being there in person for the announcement; I'd have taken just as much from reading about it the following day. None of the talks inspired me or generated any new ideas, which I felt was pretty disappointing.

Finally, there was the exhibition. I was hoping this would be the saving grace of the day but alas, it wasn't to be. The vast majority of the exhibition was people trying to sell services, services that were completely irrelevant to me as a developer. The only real highlight of the exhibition... the Xbox stand fully loaded with FIFA. Who could resist a go on that?

Overall, then, I left the conference quite underwhelmed. The future definitely didn't seem decoded to me. With that said, would I go next year? I certainly would. The keynotes, although of little relevance to my day job, were interesting. As for the technical tracks, the key to getting the most out of them may be to pick a subject I don't follow in such detail, big data perhaps. That way I suspect I'd leave the track with more knowledge than when I went in, which would make it feel like more of a success. And as for the expo... more time on FIFA would certainly be required!

Saturday, 26 July 2014

Oracle Data Provider 11.2.0.4 - Connection Leaks

Are you using the Oracle Data Provider that ships with Oracle Client v11.2.0.4? Noticed that after a period of use, the database hits its maximum number of processes, all of them generated by your application? You've only got a few users, so how can this be? It worked fine on previous versions...

Essentially, you've got a connection leak. Connections to the database are being created but not closed correctly, and over time they build up until the database starts rejecting connections with "ORA-12518: TNS:listener could not hand off client connection" or "ORA-00020: maximum number of processes (%s) exceeded".

How can we fix this?

This problem is almost always caused by not disposing of Oracle objects, so you need to ensure you've closed and disposed of all of them. OracleConnection, OracleCommand, OracleDataReader, OracleParameter: they're all disposable, so make sure they're disposed of (and closed) after you've finished using them. This ensures the connection is put back into the connection pool (assuming you're using connection pooling; if you're not, it'll just close the connection to the database).

But we didn't have this problem with 11.2.0.3 or any other previous version. What's changed?

In 11.2.0.4, the OracleConnection object must be disposed of when you're finished with it. In previous versions you could get away with just calling the Close method. Not anymore: Dispose must be called. This isn't documented anywhere, so things will just start breaking in 11.2.0.4. Cool, huh?

The best way of ensuring this is to use the using statement (http://msdn.microsoft.com/en-GB/library/yh598w02.aspx) e.g.


using(OracleConnection conn = new OracleConnection()){
     // Do stuff with our connection.
}


Unfortunately for me, the application I'm working on is rather complex and the above pattern can't be used. The application is a web application and, breaking it down to a very basic level, a connection is created at the beginning of the page life cycle, used throughout the page life cycle, and then closed at the end of the cycle. The code that closed and disposed of the OracleConnection looked a lot like this:


if(conn != null && conn.State != ConnectionState.Closed){
   conn.Close();
   conn.Dispose();
}


If we haven't opened the connection then we don't try to close it. There's no point, after all; it hasn't done anything. This is fine, apart from when the Fill method of the OracleDataAdapter is called.

The documentation of this method states:
"The connection object associated with the SELECT statement must be valid, but it does not need to be open. If the connection is closed before Fill is called, it is opened to retrieve data, then closed. If the connection is open before Fill is called, it remains open."

It opens and closes the connection if it's not already open. That's great: it saves us the hassle of having to open the connection ourselves. It also means that the check on the connection state above is going to cause us problems. Fill will have opened and then closed the database connection, so the state check skips the cleanup entirely, yet we still have to call the Dispose method in order to put that connection back into the connection pool.
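
A rough sketch of the situation (names made up, and note there's no explicit Open call anywhere):

OracleConnection conn = new OracleConnection(connStr);
OracleCommand cmd = new OracleCommand("SELECT * FROM myTableName", conn);
OracleDataAdapter adapter = new OracleDataAdapter(cmd);

DataTable table = new DataTable();
adapter.Fill(table); // Fill opens the connection for us, reads, then closes it again.

// conn.State is now Closed, so the state check from earlier skips both Close and
// Dispose, and in 11.2.0.4 the underlying pooled connection is leaked.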

The solution? Move the Dispose outside the if statement and the problem is solved.
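
So the cleanup code becomes something like this:

if(conn != null){
    if(conn.State != ConnectionState.Closed){
        conn.Close();
    }

    // Always dispose, even if Fill (or anything else) already closed the connection,
    // so that 11.2.0.4 returns it to the pool properly.
    conn.Dispose();
}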

This is one of those frustrating problems that would have been solved in pretty much no time at all if the change Oracle made had been documented. Instead, it took the best part of a week to identify the issue (complicated by the big black box which is connection pooling), reproduce the problem, ensure all objects were being disposed of as they should be, and finally find the offending code path. It just shows how important documenting changes is, especially changes that have the potential to break your application!

Monday, 21 April 2014

Is 'DevOps' Killing the Developer?

This post is inspired by a blog post by Jeff Knupp entitled "How 'DevOps' is Killing the Developer". If you haven't read it, go and do so. It's a very good read and can be found here.

For those of you that don't want to spend the time reading it, as a quick summary: Knupp argues that the 'DevOps' movement is changing what the world considers a "developer". Before, a developer would be the guy that pumps out lines of code. Nowadays, due to the start-up mentality, developers are expected not only to produce code but also to act as a system tester, a system admin, a DBA and a systems analyst, all rolled into one. Knupp argues that this is wrong; that it stops developers doing what they enjoy and what, ultimately, they're good at, which is producing code.

I think Knupp is wrong, to a degree, and this is why...


To take a phrase Knupp uses, a 'full stack' developer is not only wanted by companies, they're needed, and saying a developer should just code will affect the quality of the system they're building. I don't deny that most developers should be developers in the traditional sense, but depending on the size of your team, you need at least a few that have the skills and knowledge to be 'full stack' developers, or in other words, DevOps guys.

Why?

'Full stack' developers have two major plus points:

1.  They can predict problems that traditional developers cannot.
Because they understand the entire process, they can anticipate problems from external systems before they actually happen, which leads to a more stable system. For example, they have a good understanding of the database, so they can look at a system's design, anticipate where bottlenecks will exist at the database level, and change the design accordingly.

2.  They can take ownership of any bug.
This is a huge benefit. They have the ability to find the cause of bugs that other teams don't. Bugs that require knowledge of how everything works together.
Here's a typical scenario. A customer logs a problem with their live system. The customer support team refer it to the DBAs because the problem looks database related. The DBAs look at it, can't find any immediate problem with the database and forward the problem to the developers. The developers look at it and find it works fine in the customer's development environment. As the development environment and the live environment are, from a code standpoint, exactly the same, they forward it back to the DBAs. This process repeats. No single technical person can take ownership of the bug because neither believes their area is the cause. The bug takes a significant period of time to resolve, at the expense of the customer's happiness.
Enter the 'full stack' developer... the bug gets put on their desk. They take ownership of it; there's no need to bounce it from department to department. They investigate and find the problem to actually be some proxy server doing something it shouldn't. For the developers this is hard to find, as the tools they use to debug problems don't let them see such an issue. The DBAs can't find it because it turned out the problem had nothing to do with the database anyway. The network guys could have found it, but why ask them? The problem was never attributed to a proxy server in the first place. The 'full stack' developer knows all of these systems well enough to find these odd, but sometimes critical, problems, and when your customer's happiness is directly related to how quickly that issue gets resolved, this knowledge can prove invaluable.

One point that Knupp makes which I do agree with is that these DevOps guys find themselves doing things other than coding. But is this a bad thing?

Generally speaking, you want your staff doing what they do best, and in this case, 'full stack' developers are best at coding. You wouldn't hire a developer and put them on reception, for example. However, let's analyse the nature of these DevOps guys. They're inquisitive; they want to know more. So when faced with a problem they don't just pass it on, they take the opportunity to learn a little more and find the solution themselves. They do this because they find the field of I.T./Computer Science/Development/whatever you want to call it interesting and want to learn more. Over time, they've investigated enough problems and found enough solutions to be considered a 'full stack' developer. Should people like this be confined to "just coding"? Would that give them the ultimate job satisfaction? Is that enough to keep these developers at your company? After all, that's what you want to do... any developer that wants to carry on learning is one you'll want to keep.

In conclusion, then, is DevOps killing the developer? I'd argue no. Being a DevOps guy takes a certain attitude, and there are plenty of developers out there that are more than happy to just write code. And that's fine. That's better than fine, it's fantastic! But should we be shunning DevOps? Again, I'd argue no. There's a place for them in any software house, an important place at that. The real challenge is how to get the best out of your DevOps guys: how to balance their coding time against the other duties they end up performing. It's a fine line, but one that, if you get it right, brings some great rewards.