Posts tagged Expression Engine
Over the weekend, Twitter user @audiopleb ran into issues importing data to Expression Engine MSM sites via Datagrab. This is something we have a huge amount of experience in back at the office. That is to say, migrating data between sites (Non-EE -> EE and EE -> EE) is something of a specialty of mine.
One of the issues I’ve noticed with Datagrab (indeed, other add-ons as well) is how they deal with MSM sites. Datagrab, for example, uses this parameter to do a lot of its internal MSM handling:
$this->EE->config->item('site_id')
This seems to mean that if you are accessing the control panel via http://site_1/system, then site_id will always be 1. For some reason, possibly depending on the version of EE/MSM/Datagrab, changing the site via the CP drop down does NOT update the site_id value (at least according to Datagrab). My workaround has been to access the CP via the secondary site’s URL. For example, use http://site_2/system. This seems to get things where they want to be.
All you really need to do to make this happen is put a copy of /admin.php somewhere under site_2. I use this structure for my sites:
- /ee_system/ - site_1 system
- /html/ - site_1 web root
- /html/sites/
- /html/sites/site_2/
- /html/sites/site_2/dashboard/index.php - site_2 version of admin.php
Edit the site_2 version of admin.php to point all the way back to your main EE System folder via the $system_path variable. Mine looks like this:
$system_path = '../../../../ee_system';
This keeps things nice and relative for easy migrations. I’m sure there is a solid explanation for this occasional behavior, but every time I’ve hit it I have been deep into time-sensitive work. Some day I would love to understand why this happens, and why my work-around (aka, dirty hack) seems to work so consistently.
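As a quick sanity check on that relative path, here is a minimal sketch in plain JavaScript (Node). It assumes the path is resolved relative to the directory holding the site_2 copy of admin.php, and it uses the folder layout listed above; the variable names are mine, not anything from EE.

```javascript
// Sketch only: checks where '../../../../ee_system' lands when resolved
// from the directory holding the site_2 copy of admin.php.
// Assumption: the path is resolved relative to that directory.
var path = require('path');

var adminDir = '/html/sites/site_2/dashboard'; // folder from the layout above
var systemPath = '../../../../ee_system';      // value of $system_path

// Walk up four levels (dashboard, site_2, sites, html), then into ee_system.
var resolved = path.posix.resolve(adminDir, systemPath);

console.log(resolved); // /ee_system
```

If your site_2 folder sits at a different depth, the number of ../ segments changes with it, which is the only thing this check is meant to catch.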
• • •
The subject of Expression Engine Snippets versus EE Embeds was recently brought back to my mind by Jean St-Amand on Twitter yesterday. It has been some time since I’ve thought about the matter, so this got me thinking a bit more about EE performance. First, some background on how we got sideways and how we solved it. Then some zen learnings at the end.
At the office, we are now on the 3rd major version of our core web product. The second version of that product was our first EE project as a team, and my first EE project ever. I am fortunate to have come in with EE 2.1.0, and I was hooked right away. It was one of the more complex systems my partner in crime has built on EE, containing over 7,000 entries at the time we did the initial build (now well over 10,000 entries). We used every trick in the book to build this thing out, and by every trick I mean every WRONG way. Related entries for every entry, categories by NAME as part of our URL structure, and my LORD did we have some embeds. Our home page must have embedded 4 levels deep at a minimum. Some of those embeds went out and performed relationship lookups to then perform a 5th level of embeds.
Looking back at the Pingdom records, page response time had passed the 5.7 second mark. 5.7 seconds. There are cars that could go 0-60 mph in that time. TWICE. AB testing from a good solid cable broadband connection showed a number closer to 11 seconds. ELEVEN. Once we factored in a normal user internet connection our response time almost doubled. I was embarrassed. The smallest site (by unique visitors) I had worked on in quite some time and it was performing this poorly? We started upgrading hosting. Then migrating. Then out came Varnish. I’m not saying that Varnish is a bad tool, but by this point I knew I was just hiding the root cause. By then I had learned enough about EE to understand the error of my ways. Vint Cerf gave me the look of a disappointed grandparent.
I bring up Varnish for a reason. It’s good. Use it. Or Memcached. Or both. I don’t care how clean your architecture is, how fast your server. Even small amounts of traffic will clog things up on smaller VPS hosting or low-end dedicated stuff. Varnish keeps traffic from even getting into your AMP stack. Memcached can help keep things out of PHP and MySQL. Learn them. Use them. IronCache makes playing with Memcached simple, I highly recommend it. The Grist Labs team is awesome.
In the recent past we had enough time to both redesign and refactor this core site. Over a year of EE learnings, along with a better understanding of our product and customer, combined to generate a much better product. Everything was turned into snippets. As a rule we avoided any embeds. Ditto for any type of relationship field. We found some exceptions to the embed rule, but held firm on the relationships issue. Embeds became a tool ONLY for exposing integral data and not just for making our code “look clean”. We worked hard to understand when and where an embed was needed and when other strategies solved the problem.
The numbers tell the story. The problem version had 56 embeds in our global includes template group. The new version has 14. These 14 are also infrequently used in our code. We plan on replacing a lot of these 14 with jQuery tricks in order to take the direct embed out of EE. That will make an interesting write-up of its own when finished.
AB testing from a normal user connection followed this line of performance:
- Version 2 - Bare System: 11 seconds
- Version 2 - Varnish: 0.8 seconds (wow, Varnish is cool!)
- Version 3 - Bare System: 1.2 seconds (wow, Snippets are cooler than Varnish!)
- Version 3 - Memcached: 0.6 seconds
Those are some telling numbers. As I said before, use either Varnish or Memcached. Use both if you need, they are NOT mutually exclusive. After refactoring into Version 3, our site performance almost caught up to Version 2 with Varnish. Note that our data structure changed in just one way: Removal of all relationships. We used two relationships on the site, but they were used on almost every single page. Worse, each relationship was usually used to feed an embed. Terrible combination there.
The Zen Learnings
Test early and test often. In the traditional software world (think C++ or Java) testing is done at nearly every stage. My best developers always turned out code that was performance tuned. You do this by testing performance on each component as well as testing performance on the integrated whole. This can be done and SHOULD be done with web development as well. I feel ignorant for not doing this the first go around. How to test? AB is the standard tool for simulating load. Combine this with server utilization numbers (I use sar 4 5) and you have a winning combination.
How do you run the test? AB makes it easy, and it should be a standard install on Linux systems as well as Mac OS X. You run it like this:
- ab -n 500 -c 20 http://www.mydomain.com/
Pretty straightforward so long as you know that:
- -n 500 means to snag the requested URL 500 times (-n, number, get it?)
- -c 20 means to run 20 gets at a time (-c is concurrency)
- http://www.mydomain.com/ is the target URL to get. I seem to need the trailing / on this to make AB happy.
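If the difference between -n and -c is still fuzzy, here is a rough sketch of the idea in plain JavaScript (Node). It is not how AB is implemented and it does not hit the network: fakeRequest and runLoad are made-up names, and the "request" is just a 5 ms timer. It only shows what "500 requests total, 20 in flight at a time" means.

```javascript
// Hypothetical stand-in for an HTTP GET; resolves after a short delay.
function fakeRequest() {
  return new Promise(function (resolve) { setTimeout(resolve, 5); });
}

// Issue `total` requests (think -n), keeping at most `concurrency`
// of them in flight at once (think -c).
async function runLoad(requestFn, total, concurrency) {
  var started = 0;
  var completed = 0;
  var inFlight = 0;
  var maxInFlight = 0;

  // Each worker keeps pulling the next request until `total` are started.
  async function worker() {
    while (started < total) {
      started += 1;
      inFlight += 1;
      maxInFlight = Math.max(maxInFlight, inFlight);
      await requestFn();
      inFlight -= 1;
      completed += 1;
    }
  }

  var begin = Date.now();
  var workers = [];
  for (var i = 0; i < concurrency; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);

  return {
    completed: completed,
    maxInFlight: maxInFlight,
    elapsedMs: Date.now() - begin
  };
}

// Roughly the shape of: ab -n 500 -c 20 <url>
runLoad(fakeRequest, 500, 20).then(function (result) {
  console.log(result);
});
```

Raising the concurrency while holding the total steady is the knob to play with when you want to see where a page starts to fall over.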
When do you test? All the time. Continuously. Test each Snippet, embed and template individually then after integrating them. Create a template that just contains a single snippet with just enough supporting EE code to test what is inside of that snippet. Use preload:replace to test embeds. Test every time you think you are “done” with a feature. Test against your local dev machine. Test against production. Test, test test test and test some more.
There isn’t a magic bullet for knowing when your code is fast enough or clean enough. Indeed it is easy to assume it IS fast enough. Always challenge that assumption. It will make you think cleaner and clearer about what you are trying to do. It will force you to learn things you didn’t know existed. It will cause you to see the world from your end user’s perspective. One last bit of zen advice I learned in my performance seeking adventures:
Not all fast websites are good, but all good websites are fast.
- Me, right now
• • •
On the #EECMS zone of the Twitterz today, Matt Everson of Astuteo was having a problem with MSM and mysterious logout issues. His reasoning was that this cropped up with EE 2.3.x, but I have seen it as far back as 2.1.0. The solution is simple enough. Via the control panel visit Admin -> Security and Privacy -> Cookie Settings. You need to set the Cookie Domain as appropriate. For me I wildcard it out a bit, like this:
.mydomain.com
Repeat this once for each MSM site under the install. Problem solved? Good. Now, why is this the solution? When a cookie is issued to the client browser, it is tied to that Cookie Domain value. That is really the root of identity for each cookie. After the domain, you then have the Cookie Path and the Cookie Name.
The next bit of knowledge that helps build on the understanding here is that EE manages your login based on a Session ID number. That Session ID number is stored in a cookie named exp_sessionid. This is a simplified way to think of the cookie that results from a control panel login:
[cookie domain].[cookie path].[cookie name] = [cookie value]
If we login to Site_1 of an MSM install, we might get something like this:
[.mydomain.com].[/].[exp_sessionid] = 123456789
It is important to note that EE will use the domain for the FIRST of the MSM sites if no configured value is set for Cookie Domain across all sites. That means if you then logged in to Site_2, you get a new cookie with the new Session ID. That cookie would have the EXACT SAME identity tree, and would then overwrite the cookie set by Site_1. Now there is no longer a way for Site_1 and your browser to track the session you have on the server, and you are logged out.
Setting that unique Cookie Domain will allow you to have multiple cookies with differing domains but the same name (exp_sessionid). Since nothing gets overwritten once this is set up, you are able to keep your login working.
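The overwrite is easier to see in a toy model. This is a sketch in plain JavaScript, not how a browser really stores or matches cookies: makeJar is a made-up helper that keys each cookie on domain + path + name, following the simplified identity tree above. The .site-one.example and .site-two.example domains are hypothetical, as are the session ID values.

```javascript
// Toy cookie jar: a cookie's identity is domain + path + name.
// Setting a cookie with the same identity replaces the old value.
function makeJar() {
  var jar = new Map();
  function key(domain, path, name) {
    return domain + '|' + path + '|' + name;
  }
  return {
    set: function (domain, path, name, value) {
      jar.set(key(domain, path, name), value);
    },
    get: function (domain, path, name) {
      return jar.get(key(domain, path, name));
    },
    size: function () {
      return jar.size;
    }
  };
}

// No per-site Cookie Domain: both logins land on the same identity.
var shared = makeJar();
shared.set('.mydomain.com', '/', 'exp_sessionid', '123456789'); // Site_1 login
shared.set('.mydomain.com', '/', 'exp_sessionid', '987654321'); // Site_2 login
console.log(shared.size()); // 1, Site_1's session ID has been overwritten

// A Cookie Domain per site (hypothetical domains): two identities.
var separate = makeJar();
separate.set('.site-one.example', '/', 'exp_sessionid', '123456789');
separate.set('.site-two.example', '/', 'exp_sessionid', '987654321');
console.log(separate.size()); // 2, both sessions survive
```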
• • •
Anyone out there use ExpressionEngine? Okay, settle down. Anyone use Pixel & Tonic’s Matrix field type? Likely that the same number of people just raised their hands. Now, how many of you custom-dashboard-matrix-using-expression-engine developers also use jQuery?
That should have been everyone as well.
One of the challenges we run into at the office is looking at the values of a Matrix field to compute, compare or validate on. For example, one might have a matrix with this column format:
Item Description | Price
Very simple data entry example. Item and Price. At any time you might want to get an updated Total Price number to the end user. Looping over the entire Matrix is the pain point we found a nice solution to. Take this jQuery snippet as our reference bit:
$('div#field_id_106 textarea.matrix-textarea[name^="field_id_106[row_"]').filter('[name*="[col_id_38]"]').each(function() {
// Do some cool stuff here with the data. Access the local cell with
// $(this).val()
});
The jQuery warriors among us are saying “Well, yeah, duh!”. The rest of us are looking at Sanskrit. The magic is in the name^= and name*= selectors. First, we select down to the Matrix we care about. In my example, it is contained inside of div#field_id_106. Next I select down to the textarea stuff, since the fields I care about here happen to be textarea fields. Now for some magic.
[name^="field_id_106[row_"]
The operator ^ after the name attribute tells us that the selector we are isolating down to must START with the string field_id_106[row_ in order to be included. Stopping at the row_ is a trick that really made this useful for us. Both NEW matrix rows as well as UPDATE matrix rows (existing values in place) will contain this. You won’t need to know anything about the matrix you are looking at in order for this selector to work.
Last, I add the .filter() to our object. This filters further to look at a specific column in the Matrix. I take advantage of the * operator here (can contain ANYWHERE), and I look for the [col_id_38] string in the name attribute. We wrap it up with the .each() call, and now we have a nicely set loop. Within the loop we can get to our cell values with:
$(this).val()
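To see the two matching rules without a control panel page in front of you, here is a sketch in plain JavaScript that applies the same tests to a list of name strings and adds up the Price column. The cell names and values are made up for illustration (I am assuming names shaped like field_id_106[row_...][col_id_...], with both existing and new rows), and priceTotal is a hypothetical helper. startsWith plays the role of name^= and includes plays the role of name*=.

```javascript
// Hypothetical cells, shaped like the name attributes the selector targets.
var cells = [
  { name: 'field_id_106[row_id_12][col_id_37]', value: 'Widget' },
  { name: 'field_id_106[row_id_12][col_id_38]', value: '4.50' },
  { name: 'field_id_106[row_new_1][col_id_37]', value: 'Gadget' },
  { name: 'field_id_106[row_new_1][col_id_38]', value: '10.25' },
  { name: 'field_id_107[row_id_3][col_id_38]', value: '99.00' } // other field
];

// Sum the Price column (col_id_38) of Matrix field 106.
function priceTotal(allCells) {
  return allCells
    .filter(function (cell) {
      // Same test as [name^="field_id_106[row_"]: must START with this.
      return cell.name.startsWith('field_id_106[row_');
    })
    .filter(function (cell) {
      // Same test as [name*="[col_id_38]"]: may appear ANYWHERE.
      return cell.name.includes('[col_id_38]');
    })
    .reduce(function (sum, cell) {
      // Treat blank or non-numeric cells as zero.
      return sum + (parseFloat(cell.value) || 0);
    }, 0);
}

console.log(priceTotal(cells)); // 14.75
```

Inside the real .each() loop the same running total would be built from $(this).val() on each matched textarea.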
There are going to be a lot of ways to slice this turkey, and I would love to hear from anyone else that has faced this issue. Are there even cleaner ways to get into the Matrix cells via jQuery?
• • •