Organizing, managing and deploying PHP projects in Subversion (SVN)
Managing projects in SVN can be difficult, especially if you are just starting out. Hopefully I can clear up some common misconceptions about source control and help you understand it a little better.
Many prefer to use the command line, but I find a GUI tool like Cornerstone, Versions or Tortoise easier for beginners to work with. For this tutorial I will be referencing Cornerstone 2. This guide is meant for those who are already slightly familiar with the basic concepts of SVN. It does not go into detail about setting up SVN, though I recommend that beginners check out Beanstalk, a web-based tool for setting up and managing SVN/Git.
How does SVN work?
All you really need to know is that SVN provides a way to store all of the revisions made to your code-base in a remote (or local!) repository. Each time you commit your code to the repository, SVN creates a new revision number for that state of the code. This allows you to compare different revisions of your code and to branch off into new lines of development without getting different versions mixed up.
SVN also makes it easy to work in a collaborative environment. Multiple coders can work on the same code-base and their changes will be merged together by SVN and given a revision number. Here are some important terms to understand about SVN:
Repository – The repository is the main folder that stores your SVN files. It is typically organized into three folders: trunk, branches and tags.
Trunk – Think of your SVN repository as a tree. The trunk is where the main code-base is found. While not all SVN setups are the same, the trunk will generally contain the freshest development copy of your code.
Branches – SVN, like a tree, can be branched. A branch is generally a line of development that contains a copy of the main code-base from the trunk but is worked on separately. Branches are great for fixing bugs, testing new functionality and adding new features without disturbing the main code-base found in your trunk.
Tags – A tag is basically a snapshot of the code at a particular revision. Let’s say you just released version 1.0 of your code and you want to make sure you have a separate copy of the code that will never change. You would make a tag of it. (This is a convention: SVN itself does not stop anyone from committing to a tag, so the team simply agrees not to.) This way you always have a copy of the code that you deployed or checked out in case you need to revert to it or examine it for whatever reason. A tag is typically made of either the trunk or a branch.
Checkout – Checking out code from a repository gives you a fresh copy of the latest revision of the code-base. The directory you checked the code out to becomes your working copy, a local mirror of the repository at that revision. You can then start editing the code and committing changes!
Commit – Once you have checked out your code, worked on whatever it was you wanted to get done and have completed it, you would then commit the new code to the repository. Each commit adds a new revision number to the repository so you can always see what the code was like before that commit.
Update – Let’s say the last time you checked out the code was yesterday. If you are working with a team of developers then one of them may have committed a change since the last time you updated or checked out the code. Updating brings your working copy up to the latest revision. You would also typically update before a commit. This way, if there is a conflict that must be resolved, you can fix it locally before you commit.
Conflict – A conflict occurs when SVN can’t figure out how to merge two different changes. This means you will have to manually decide what to do about that particular conflicted code. Cornerstone (and other GUIs) will show you the conflict in the code and ask what you want to do with it, typically using a function called diff.
Merge – Let’s say you made a branch off of the trunk in order to add a new feature. Once that feature is complete you would merge it back into the trunk so that it could be part of the main code base. You might also follow this same method for bug fixes.
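For readers who would rather see these terms on the command line than in a GUI, here is a minimal sketch of the checkout, commit and update cycle. It assumes the svn and svnadmin command line tools are installed, and it works against a throwaway local repository in a temporary directory. The paths and the index.php file are made up for illustration.

```shell
#!/bin/sh
# Sketch: the basic SVN cycle against a throwaway local repository.
set -e

# Skip politely when the Subversion command line tools are missing.
command -v svnadmin >/dev/null 2>&1 || { echo "Subversion is not installed; skipping"; exit 0; }

WORK=$(mktemp -d)                 # scratch directory (hypothetical location)
svnadmin create "$WORK/repo"      # a local repository, no server needed
REPO_URL="file://$WORK/repo"

# Create the conventional trunk/branches/tags layout in a single commit.
svn mkdir -q -m "Create standard layout" \
    "$REPO_URL/trunk" "$REPO_URL/branches" "$REPO_URL/tags"

# Checkout: get a working copy of the trunk.
svn checkout -q "$REPO_URL/trunk" "$WORK/wc"
cd "$WORK/wc"

# Edit, then commit: each commit creates a new revision.
echo '<?php echo "Hello";' > index.php
svn add -q index.php
svn commit -q -m "Add index.php"

# Update: pull in anything committed since the last checkout or update.
svn update -q

# Ask the repository for its newest revision number.
REV=$(svnlook youngest "$WORK/repo")
echo "Repository is now at revision $REV"
```

With a hosted repository you would skip svnadmin create and use the URL your provider gives you in place of REPO_URL.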
How do I manage releases, bugs and changes in SVN?
I mentioned this briefly in my explanation of the terms above, but it is important enough to cover in more detail. Before even writing your code, it can be a good idea to develop a basic build and deployment strategy to keep your code versions organized.
Here is a graphic of a recent build strategy that I used. I’ll go into detail about what all these lines mean:
In this diagram the trunk is represented by a straight line of development. This is the main code-base which is checked out and worked on for implementation of new versions and features. Since I am the sole developer on this project it is alright to use the trunk as my main line of development. However, in a multi-dev situation you would likely branch off and each dev would merge changes and fixes from their branch into the trunk. There really isn’t a set methodology for this, so find out what works for your project.
In the diagram above you will see a branch off of the trunk called RELEASE-v_v. This means that when version 1.0 of the software is released, we branch out and call it RELEASE-1_0. Once we create this branch, all changes/updates/fixes that need to be done to the live site occur in this new branch and not the trunk. The trunk is instead used for the new version and features of the site. However, you will notice that if a bug fix is made on the RELEASE-1_0 branch it is merged back into the trunk. This will keep the core code in the trunk up to date with the live site throughout development.
There was also a TAG created when we made the branch for RELEASE-1_0. This is good practice because it gives you a snapshot of what the software was like at key moments in the development process. In this case it’s a snapshot of the initial release just before it was deployed.
You will also notice that I am branching off for major bug fixes to the main code base. This is good practice because, when you run into a major bug, you may need the option to revert to the code PRE bug fix or POST bug fix. Branching off and creating snapshots of both PRE and POST is a great way to revert if necessary or to just view the code at a later date. Once the bug is fixed, that branch is merged back into the trunk and the bug branch is deleted (remember we still have our PRE and POST tags though).
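On the command line, the release strategy described above might look roughly like the sketch below. It assumes svn and svnadmin are installed and uses a throwaway local repository again. The tag name tags/RELEASE-1_0 and the file app.php are my own placeholders, not something the diagram prescribes.

```shell
#!/bin/sh
# Sketch: release branch, release tag, and merging a live bug fix into trunk.
set -e

command -v svnadmin >/dev/null 2>&1 || { echo "Subversion is not installed; skipping"; exit 0; }

WORK=$(mktemp -d)
svnadmin create "$WORK/repo"
REPO_URL="file://$WORK/repo"
svn mkdir -q -m "Create standard layout" \
    "$REPO_URL/trunk" "$REPO_URL/branches" "$REPO_URL/tags"

# Version 1.0 is developed on the trunk.
svn checkout -q "$REPO_URL/trunk" "$WORK/trunk"
cd "$WORK/trunk"
echo 'v1' > app.php
svn add -q app.php
svn commit -q -m "Version 1.0"

# Release time: a branch and a tag are both cheap copies of the trunk.
svn copy -q -m "Branch for release 1.0" \
    "$REPO_URL/trunk" "$REPO_URL/branches/RELEASE-1_0"
svn copy -q -m "Tag release 1.0 (hypothetical tag name)" \
    "$REPO_URL/trunk" "$REPO_URL/tags/RELEASE-1_0"

# A bug on the live site is fixed on the release branch, not on the trunk.
svn checkout -q "$REPO_URL/branches/RELEASE-1_0" "$WORK/release"
cd "$WORK/release"
echo 'v1 with bug fix' > app.php
svn commit -q -m "Fix bug on live site"

# Merge the fix back so the trunk stays up to date with the live site.
cd "$WORK/trunk"
svn update -q
svn merge -q "$REPO_URL/branches/RELEASE-1_0"
svn commit -q -m "Merge RELEASE-1_0 bug fix into trunk"
```

The tag still holds the code exactly as it was released, while the trunk now contains the fix.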
Many beginners have a hard time understanding the concept of deployment to a web server. This is because they have typically used FTP to update their production copy of a particular code-base. Uploading files by hand is error-prone: it is easy to miss a file or overwrite the wrong one, and there is no simple way back. The solution is automated deployment. There are many choices when it comes to deployment software (Hudson, Capistrano, Phing) but I typically use Beanstalk. They offer a simple interface and setup for web deployments. All you have to do is input your FTP credentials and manage some settings and they do the hard work for you!
I usually keep one branch in my repository for each environment I deploy to (top right of the deployment graphic), usually qa, stage and prod. When the code is ready, I just merge the new changes into the branch for the environment I want to update, and the deployment is set to run from that branch.
Here’s the typical deployment process that I use:
- Initiate deployment (usually by clicking a Build or Deploy button within the deployment tool)
- Mkdir www/new.example.com on your web server
- Upload latest codebase to www/new.example.com (This is handled automatically via FTP by Beanstalk)
- Rename config files (in my case I mv config/core.prod.php to config/core.php and config/database.prod.php to config/database.php).
- Copy www/example.com to backups/example.com.bk (this creates a backup of the current version of the website, which is great if there is a problem because all we have to do is make this backup live and we are back where we started!)
- Remove the old www/example.com (it is now backed up), then mv www/new.example.com to www/example.com. If the old directory were still in place, mv would put the new directory inside it instead of replacing it. This makes the latest uploaded copy of the site go live immediately.
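The steps above can be sketched as a shell script. This is only an illustration under a few assumptions: a temporary directory stands in for the web server, the upload is faked with a few echo commands (Beanstalk would do that part over FTP), and the old live directory is removed before the final mv, because mv would otherwise place the new directory inside the existing one.

```shell
#!/bin/sh
# Sketch: the deployment steps, simulated in a temporary directory.
set -e

ROOT=$(mktemp -d)      # stands in for the web server's home directory
SITE=example.com

# Simulated starting state: a live site already exists.
mkdir -p "$ROOT/www/$SITE" "$ROOT/backups"
echo 'old' > "$ROOT/www/$SITE/index.php"

# Create the upload directory next to the live site.
mkdir -p "$ROOT/www/new.$SITE/config"

# Stand-in for the upload that the deployment tool performs.
echo 'new' > "$ROOT/www/new.$SITE/index.php"
echo 'prod core settings' > "$ROOT/www/new.$SITE/config/core.prod.php"
echo 'prod database settings' > "$ROOT/www/new.$SITE/config/database.prod.php"

cd "$ROOT"

# Activate the production config files.
mv "www/new.$SITE/config/core.prod.php" "www/new.$SITE/config/core.php"
mv "www/new.$SITE/config/database.prod.php" "www/new.$SITE/config/database.php"

# Back up the current live site, replacing any previous backup.
rm -rf "backups/$SITE.bk"
cp -R "www/$SITE" "backups/$SITE.bk"

# Swap: remove the old live directory, then move the new one into place.
rm -rf "www/$SITE"
mv "www/new.$SITE" "www/$SITE"

echo "Deployed; previous version saved in backups/$SITE.bk"
```

To roll back, you would remove www/example.com and copy backups/example.com.bk back into its place.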
Notes on deployment
Deployment software will typically allow you to run a script before the deployment and after the deployment. In Beanstalk’s case they are called hooks. These scripts are where you run the above commands to get your deployment all set up. Once you have your hooks ready to go and tested, all you have to do is click “Deploy” and voilà! Your latest code is live!
Also, you typically want to keep your real config files (core.php and database.php in my case, since I typically use CakePHP) out of your repository. Instead, keep a template of each in the repository, called core.dev.php and database.dev.php. When the code is checked out, the dev can copy these templates to core.php and database.php and customize the copies for their setup. We use a similar process for deployment to qa, staging or production, with a matching set of files for each environment (core.prod.php and database.prod.php for production).
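As a rough sketch of that setup step, a dev could run something like the following after a fresh checkout. The temporary directory stands in for the working copy and the file contents are placeholders. It copies the dev templates rather than renaming them, so the versioned templates stay in place, and it never overwrites settings that already exist.

```shell
#!/bin/sh
# Sketch: create local config files from the versioned dev templates.
set -e

APP=$(mktemp -d)       # stands in for a freshly checked-out working copy
mkdir -p "$APP/config"
echo '<?php // dev core settings' > "$APP/config/core.dev.php"
echo '<?php // dev database settings' > "$APP/config/database.dev.php"

for name in core database; do
    target="$APP/config/$name.php"
    # Never overwrite a developer's existing local settings.
    [ -f "$target" ] || cp "$APP/config/$name.dev.php" "$target"
done

ls "$APP/config"
```

To stop the local copies from being committed by accident, you can list core.php and database.php in the svn:ignore property of the config directory.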
Well, I hope this clears up some of the confusion surrounding SVN. If you have any questions or criticisms, feel free to leave me a comment or ask me on Twitter. Thanks for reading!