Sep 20

Why should you not deploy software on a Friday? I’ve been pondering deployment issues and looking into best practices for timing software releases (time of day, day of week). I am not considering how often to release (I think short iterations are best). The true answer always depends heavily on the individual organization. Below are some points to consider that may make a good case against releasing web applications on a Friday. Again, they may not apply to your organization, but you might also weigh them for other times and days.

  1. In some methodologies, the release itself is pretty much a non-event because of a rock-solid integration environment and continuous integration practices. However, it is still almost impossible to predict user reaction and usage exactly, so having support (software, network, system, etc.) available can be crucial after software is released. And typically the best support personnel do not work on weekends or late at night.
  2. A lot of groups end their iterations on Friday. If they are also under the gun to deploy, they may rush to finalize those last features, which can hurt quality and leave the team fatigued. Neither risk is good for a deployment. So if the team has been pushed to finish the iteration, you may want to give them a chance to recover before you push them into a major deployment.
  3. Related to the issue above is the situation in which the production environment is complex. Deployment may become a large process in which many changes have to occur at once to support the new features of the release. This requires planning, and the planning can suffer greatly if the rush to finish the iteration takes priority. One database table change that gets missed because of the rush to meet a Friday deployment deadline can cost thousands in e-commerce revenue, all for want of better planning. (One way to avoid this is to make deployment part of the iteration with a proper LOE and priority.) Scott Ambler discusses dealing with complex deployments in Planning for Deployment.
  4. How do your users like Friday releases? Is Friday a critical time for them to use the software? If so, then they may not like the hassle of figuring out new/changed functionality.

Do you know of any other reasons to avoid certain times/days of the week for deployment?

Sep 13

In Rails Session Management Howto, Part III of this series, I described memory based session storage approaches. The mem_cache_store approach provides fast access to the session data and unparalleled scaling, but it doesn’t provide rock-solid reliability (because it is ultimately a cache). It may also be overkill for a lot of applications. In this post, I will discuss the final approach: database based sessions.

There are a couple of options. The first is to use DRb storage. With the drb_store, the session data is marshaled to a DRb server. The DRb server is accessible from multiple servers, so you can scale your application out to many servers, and it is also reliable. DRb stands for Distributed Ruby; more information about DRb and the DRb server is available in Intro to DRb. Performance is reported to be very solid with DRb based session storage.
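As a rough sketch, a Rails 2.x-era drb_store configuration might look like the following; the option names and the DRb URL here are assumptions, so check the documentation for your Rails version:

```ruby
# config/environment.rb (inside the Rails::Initializer block)
# Assumes a DRb session server is already running at this URL.
config.action_controller.session_store = :drb_store
config.action_controller.session_options[:session_url] = "druby://localhost:9192"
```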

The second option is to utilize the built-in Active Record capability of Rails. I like the active_record_store because it is easy to configure and immediately provides scalability and reliability for session data storage. Performance depends largely on the database server infrastructure, which is a well-known field with many different optimization possibilities. Rails provides a simple way to set up the sessions table: run rake db:sessions:create to generate the migration, then run it via rake db:migrate.
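As a sketch (Rails 2.x-era syntax; verify against your version), switching to the active_record_store is a one-line configuration change plus the generated migration:

```ruby
# config/environment.rb (inside the Rails::Initializer block)
config.action_controller.session_store = :active_record_store

# Then, from the shell:
#   rake db:sessions:create   # generates the sessions-table migration
#   rake db:migrate           # runs it, creating the sessions table
```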

As the authors of Agile Web Development with Rails point out, the proper choice of session storage is uniquely application and environment driven. There is an older study by Scott Barron comparing the performance of some of these approaches. Although the results may have changed slightly, the considerations and insights are probably still valid.

I personally use the active_record_store as my default approach. It requires no special outside expertise to implement and for most applications it is scalable and reliable. What do you use?

Sep 07

In Rails Session Management Howto, Part II, I discussed using the PStore approach for session data storage. The p_store based sessions utilize the local OS file system. In this post, I will present memory based storage approaches for session management in Rails.

The first approach is to use memory_store based sessions. With MemoryStore, the session objects are kept in the application’s memory with no serialization necessary. While this makes it extremely fast for an application to move objects in and out of the session store, it is not a reliable method, because the memory holding the session data is available only to a single server. It also does not scale well, since it requires sticky sessions.

The second approach utilizes memcached, a high-performance, distributed memory object caching system. Memcached is used by some of the largest websites in the world and is certainly a very solid approach for session storage. The mem_cache_store based sessions meet the criteria of scalability (just add more servers), but this approach is still not reliable: because it is a cache, you still need some form of reliable storage for your session data, such as a database store. But if you need super-fast reads of the session data across multiple servers, then memcached is really the best-performing approach. You can find that approach discussed in Sessions. Several memcached Ruby clients are available, including RMemCache.
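A hedged sketch of that setup, assuming Rails 2.x-era syntax, the memcache-client style API, and a memcached daemon already running on its default port:

```ruby
# config/environment.rb -- assumes memcached on localhost:11211 (its
# default port); option names may vary by Rails version.
config.action_controller.session_store = :mem_cache_store
config.action_controller.session_options[:cache] =
  MemCache.new("localhost:11211", :namespace => "myapp_sessions")
```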

For more discussion of these memory based sessions and their configuration, I recommend you pickup a copy of the excellent reference Agile Web Development with Rails.

Are you using a memory based session approach? How do you scale and protect against server crashes (or maintenance)?

Aug 30

In Rails Session Management Howto Part I, I introduced the concepts of managing HTTP sessions with Rails and explored the first approach, cookie based sessions. Two of its limits were the size of the data that can be stored in the session and the lack of encryption as the data travels between browser and server. The next approach is to store session data in a flat file on the server, in what is known as the PStore format. This format stores the serialized (marshaled) session data on the file system. The location and name (actually just the prefix of the name) of the file can be configured in the environment.rb file. Refer to Agile Web Development with Rails by Dave Thomas for details on the syntax and configuration.
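As an example of that configuration, a Rails 2.x-era sketch might look like this; the directory and prefix are hypothetical, and the option names follow the CGI::Session::PStore conventions, so verify them for your version:

```ruby
# config/environment.rb (inside the Rails::Initializer block)
config.action_controller.session_store = :p_store
config.action_controller.session_options[:tmpdir] = "/var/myapp/sessions"  # hypothetical path
config.action_controller.session_options[:prefix] = "myapp_session."
```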

The benefits of using the p_store based sessions are that the data is securely kept on the server and never crosses the network between the browser and server. This provides security and also reduces bandwidth usage. The size limit of the session data is also greatly increased (limited by your system IO).

What happens when scaling the number of servers? Clearly each server cannot have its own PStore unless one uses “sticky” sessions and is willing to have users lose their session data when a server fails. That is not an acceptable trade-off for a scalable, reliable, load-balanced system. With more than one server, the PStore file must be available to all of them, because subsequent HTTP requests may be directed to a different server each time. One way to do this is to place the PStore file on a network-mounted storage system.

Thus p_store based sessions offer increased data security and reduced bandwidth usage versus cookie based sessions. However, they also introduce some challenging server configuration choices and network file storage. It is an IO-limited solution that requires a lot of optimization and monitoring. For some applications this might not be a problem, but it should be tested: in an application with many simultaneous sessions, the number of PStore files can grow very large.

I’ll also briefly mention that there is a file_store option for sessions in Rails which also uses flat files, but it is rarely used because the session data must be strings.

Is anyone using p_store based sessions in their applications? Is it scalable? Is it reliable when servers failover?

In the next part of this series I will examine some memory based sessions.

Aug 24

Today I am going to start a series of posts describing various approaches to managing HTTP sessions in a Ruby on Rails application. HTTP sessions are valuable for managing state on top of a stateless protocol like HTTP. The emphasis here will be on moving toward the most resilient and scalable solutions. For more detail on sessions and Rails, I strongly recommend you purchase Agile Web Development with Rails by Dave Thomas.

Sessions can contain any objects which can be marshaled (think serialization if you are used to other languages). Rails provides numerous ways to persist the session objects. The default in Rails 2.0 is to send it via a cookie to the client. Because the session is hash-like, multiple objects may be stored in the session. The size limit of cookie based session objects is 4K total.

Because the session data is passed with every HTTP request from the client to the server, bandwidth usage can increase compared to the alternative approaches (discussed in the next posts). Also, even though the cookie contents are signed, they are unencrypted by default, so no sensitive data should be stored in the session when using the cookie_store.

If your session data is very small and the data is not sensitive, then using the cookie_store can be very effective. However, for many web applications this is not the case and fortunately Rails provides some effective solutions.

You can override the session defaults with a session declaration in your controllers.
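For illustration (Rails 2.0-era syntax; the cookie name and secret shown are placeholders), overriding the defaults can happen both globally in environment.rb and per controller with the session declaration:

```ruby
# config/environment.rb -- the cookie store signs the cookie with :secret
config.action_controller.session = {
  :session_key => '_myapp_session',  # hypothetical cookie name
  :secret      => 'a-long-random-string-of-30-plus-characters'
}

# app/controllers/public_pages_controller.rb -- per-controller override:
# these static pages never touch session data, so skip sessions entirely.
class PublicPagesController < ApplicationController
  session :off
end
```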

Aug 21

Recently I learned of an issue with an application where the presentation of a resultset in the UI led end users to assume its order was a feature of the application. This is a pretty common scenario that I have seen with many applications, both on the web and off. Usually it results from not using an ORDER BY clause in the SQL query, or from not having any logical column in the resultset to sort by. The problem is amplified when the end user is the one who created the data in the first place and believes that the order in which they created it (via the GUI, normally) is preserved by the application. In time, the resultset from the scenario described above may be returned in a different order. Consult your RDBMS guru for all the scenarios where this might happen; I have seen it occur during imports/exports of data, during database replication, and during updates that affect all rows in a table.

Regardless of the cause, the net result is guaranteed to be user complaints about functionality (the ordering of the resultset in the UI) that was never intended to be part of the application. This is what I term accidental perceived functionality (no doubt somebody has a better name for it). Please post other examples in the comments so we can all learn.

Tips to avoid this situation:

  1. Hire a good business analyst to capture all requirements (including eliminating possible points of confusion). This won’t necessarily avoid the problem, but it will improve the odds.
  2. Make it a habit to always include an ORDER BY clause in all SELECT statements.
  3. When defining tables, always provide a column that allows ordering of results in accordance with the desired functionality (captured in the requirements).
  4. Thoroughly unit test the result sets (including their order) of your DAOs.
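Tips 2 and 3 can be sketched in plain Ruby (the data here is hypothetical): each row carries an explicit position column, and the code always sorts by it rather than trusting whatever order the database happens to return.

```ruby
# Rows as a database might return them -- in no guaranteed order.
rows = [
  { :id => 7, :title => "second", :position => 2 },
  { :id => 3, :title => "first",  :position => 1 },
]

# Sort on the explicit position column, the equivalent of ORDER BY position.
titles = rows.sort_by { |r| r[:position] }.map { |r| r[:title] }
# => ["first", "second"], regardless of insertion or storage order
```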

Jul 28

Today I am continuing a series of posts in which I will be reviewing some of the books which are related to the development of quality software. They may be specific to a certain technology or a software development methodology.

The Pragmatic Programmer: From Journeyman to Master by Andrew Hunt and David Thomas is aimed at developers who want to build software systems that are easy to design, build, test, and extend. Andy and Dave reveal some of the basic practices they follow throughout the software lifecycle, as well as their approach to project management and career development. I am an enthusiastic fan of their work and believe every developer should read this book early in their career.

This book is also one of the easiest reads I have ever had. The chapters are concise nuggets of information, and I found myself flying through them and then stopping to reflect on what I had just read. Even though the chapters are grouped, each is mostly independent of the others, which makes it easy to stop and start without having to recall details from previous chapters. So the book is one you can pick up months or years later and go right to a chapter you want to reread without needing to read the whole book again. In that sense, it is almost a reference book.

Good-Enough Software

Of the many gems in the book, one is “Good-Enough Software”, which is about creating software that users qualify as good enough. This can be achieved by empowering users to be part of the process and by making quality an explicit part of the requirements specs. In the Agile world this is related to choosing the simple design over the complex one, with the caveat that the design must work.

DRY Principle

Another great idea is the DRY (Don’t Repeat Yourself) principle. The idea is to eliminate any duplication of functionality and define a single authoritative location for it. This creates consistency and simplicity and reduces many maintenance headaches. Not only is this important within a single software system; with the advent of Service Oriented Architectures (SOA), you can apply it across systems.
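A toy Ruby illustration of DRY (the names here are hypothetical): the tax rule lives in exactly one method, so an invoice and a receipt can never disagree about it.

```ruby
# The single authoritative home of the tax rule.
TAX_RATE = 0.07

def with_tax(amount)
  (amount * (1 + TAX_RATE)).round(2)
end

# Both callers reuse the one definition; change TAX_RATE once
# and every caller follows.
invoice_total = with_tax(100.00)
receipt_total = with_tax(100.00)
```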

Broken Windows Theory

One further nugget is the idea that leaving “broken windows” (bad code) unrepaired in a system will eventually compound and pretty soon you will have a whole system (building) full of broken windows and a team that doesn’t mind throwing stones occasionally. The idea is to take the time to repair your code as soon as you see an issue so that the software doesn’t deteriorate.

Consider this a must-have for your collection and get a copy of The Pragmatic Programmer: From Journeyman to Master today.

Jul 08

Today I was looking through my RSS reader of choice, Google Reader, at the software development articles I have starred. It occurred to me that you might benefit from knowing whose feeds I follow. Here are my top five:

  1. Uncle Bob’s Blatherings

    This is a category on the Object Mentor Blog dedicated to Robert Martin’s writing. It is one of my favorites, for both practical coding advice and thoughts on methodologies and mentoring programmers.

  2. PragDave

    Dave Thomas’s Pragmatic Programmer Blog is full of small tidbits of useful insight on coding and also some pointers to the books his company puts out.

  3. Martin Fowler’s Bliki

    As Martin describes it, this is “a cross between a blog and wiki” of his partly-formed ideas on software development. It is chock-full of industry news and his thoughts on different methodologies.

  4. Google Code Blog

    If you do any kind of web development, this Google blog is full of information on free APIs and techniques which will improve your skillset.

  5. Joel On Software

    This is Joel Spolsky’s site, where he promotes his consultancy and his views on running a software company. I threw this one in as a site to consider reading even if you don’t necessarily agree with his approach. Sometimes being able to rationally debate an idea about software craftsmanship is extremely valuable.

Whose feed are you reading?

Jul 06

Since I have been posting some of my favorite refactorings (5 Great Code Refactorings) and re-reading Refactoring by Martin Fowler lately, I thought I would discuss five refactorings that I wish I used more.

  1. Move Method

     While I use this refactoring on a regular basis, I’d like to use it more in one specific case: moving more behavior into my data classes so that they don’t just contain simple get and set methods and private fields. It may even be possible to make the get and set methods private at some point. By using this refactoring on data classes, I believe my classes will have improved cohesion in the OOD sense.

  2. Replace Magic Number with Symbolic Constant

     I also use this on a regular basis when I am coding. For example, I might use the literal 1024 several times within a class; I usually replace it with a constant that explains what it is, so I only need to change it in one place. That is fine. However, I resolve to replace some magic numbers with a constant even when they are used only a single time, because naming the constant appropriately improves the readability of the code.

  3. Encapsulate Collection

     This is a refactoring that I have never used and resolve to try this year. The idea is to refactor a class containing a method that returns a modifiable collection into a class whose method returns a read-only collection, with separate add and remove methods. In Java, this is done using the java.util.Collections utility methods unmodifiableList(), unmodifiableSet(), etc. This produces proper encapsulation and leaves the responsibility for manipulating the collection’s values with the class that owns it. The coupling then becomes similar to that of a data class with get and set methods on non-collection objects.

  4. Introduce Null Object

     This is one of my favorite refactorings. I saw it presented at a users’ group meeting many years ago in a discussion of the null object pattern. The basic idea is to replace repeated checks for a null value with a null object. Normally you test whether an object is not null and then call a method on it. Instead, you create a null object (a subclass of the same type as the non-null object) whose methods produce the same behavior your code would produce when the null test is true, and then simply call the method. I use this refactoring infrequently and resolve to use it much more this year. One of my biggest motivations is looping through objects for display purposes: I can eliminate all those tests for null values that just lead to displaying blanks.

  5. Replace Nested Conditional with Guard Clauses

     I’ll give a very simple example of this here:

         if (i == 1) { result = odd1; }
         else if (i == 2) { result = odd2; }
         else if (i == 3) { result = odd3; }
         else { result = normal; }
         return result;

     can be refactored to:

         if (i == 1) return odd1;
         if (i == 2) return odd2;
         if (i == 3) return odd3;
         return normal;

     The above method handles the unusual cases where i equals 1, 2, or 3 by “guarding” against them up front. This refactoring provides more clarity within the code, and I resolve to use it more.
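The Introduce Null Object refactoring above can be sketched in Ruby; the class names here are hypothetical, and this is a minimal illustration rather than production code:

```ruby
# A real customer with a name.
class Customer
  attr_reader :name
  def initialize(name)
    @name = name
  end
end

# The null object: answers the same messages with safe defaults.
class NullCustomer
  def name
    ""  # renders as a blank in the UI, no nil check required
  end
end

# One factory decides nil-vs-real once, instead of every caller testing nil.
def customer_for(record)
  record ? Customer.new(record) : NullCustomer.new
end

# The display loop needs no nil tests at all.
names = ["Alice", nil, "Bob"].map { |r| customer_for(r).name }
# => ["Alice", "", "Bob"]
```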

If you have any questions about how to perform the above refactorings, I encourage you to leave them in the comments, and I also highly recommend picking up a copy of Refactoring.

I believe that each of these refactorings will contribute to better code quality based on Why Refactor. What are your favorite refactorings?

Jun 26

I have been trying to determine strategies for implementing sessions via Ruby on Rails. I am particularly concerned about scalability and session replication across multiple servers in large scale sites. What is the proper choice? Here are some options for session management with Rails:

  1. No session
  2. PStore
  3. ActiveRecordStore
  4. MemCacheStore

Using PStore writes to a local file system, which doesn’t scale across multiple servers unless the directory is shared and visible to them all; that isn’t very practical across multiple data centers. ActiveRecordStore uses a database, which means each access of the session objects may consume DB resources. Again, this isn’t very scalable.

Thus, MemCacheStore looks like the way to go for most web applications. There is a great discussion of it by Stefan Kaes.

Anybody else using another solution?
