Tech Blog :: php


Mar 24 '11 10:34am

Track Freshbooks Expenses in Google Docs with PHP and XML

I've been trying to automate as much of my financial forecasting as possible, with coding up front that will last a while. My primary tools are Freshbooks (for expense and invoice tracking) and Google Docs for spreadsheets. I wrote yesterday about pulling data from one spreadsheet into another using importRange. Last night I took it several steps further, pulling expenses from the Freshbooks API into XML, then XML to GDocs, and automating tax calculations based on expense category.

1. Freshbooks Expenses to XML

Building on an existing freshbooks-php library, I wrote a PHP script called freshbooks_expenses_xml. (Link goes to GitHub.)

To get it set up, create a keys.php file, and put the whole package on your server somewhere. Play with the parameters described in the readme to get different XML output.
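
For orientation, here is a minimal sketch of the kind of XML such a script could emit. It is not the actual freshbooks_expenses_xml code and does not call the FreshBooks API: the sample rows and the field names (date, amount, category) are assumptions for illustration. The only thing taken from this post is the shape, an expenses root with one child element per expense, which is what the "//expenses/*" XPath in the next step selects.

```php
<?php
// Build an <expenses> document from plain arrays.
// Field names are hypothetical; the real script may use different ones.
function expenses_to_xml(array $expenses): string {
    $doc = new DOMDocument('1.0', 'UTF-8');
    $root = $doc->appendChild($doc->createElement('expenses'));
    foreach ($expenses as $expense) {
        $node = $root->appendChild($doc->createElement('expense'));
        foreach ($expense as $field => $value) {
            $child = $node->appendChild($doc->createElement($field));
            // createTextNode escapes characters like & and < for us.
            $child->appendChild($doc->createTextNode((string) $value));
        }
    }
    return $doc->saveXML();
}

// Made-up sample data standing in for the API response.
$sample = array(
    array('date' => '2011-01-15', 'amount' => 45.5, 'category' => 'Meals & Entertainment'),
    array('date' => '2011-02-01', 'amount' => 20, 'category' => 'Hosting'),
);
echo expenses_to_xml($sample);
```

Using DOMDocument with text nodes keeps category names containing an ampersand from breaking the XML.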

2. XML to Google Docs

In cell A1 of a clean spreadsheet, enter this function:
=importXML("http://your-site.com/freshbooks-expenses/expenses.php?date_from=2011-01-01&date_to=2011-12-31&headers=1&", "//expenses/*")
GDocs will fetch the data and populate the spreadsheet. (Note: I had some trouble making the headers consistent with the columns, and worked around it; you might want to do the same by omitting headers=1 in the function and putting in your own.)

3. Making useful tax calculations with the data

For estimated quarterly taxes (as an LLC), I need to know my revenue (calculated in another spreadsheet, not yet but possibly soon also pulled automatically from Freshbooks) minus my business expenses. As I learned doing taxes for 2010, not all expense categories are equal: Meals & Entertainment, for example, is generally deducted at 50%, while others are 100%. This is easy to do with custom GDocs functions. Next to my expenses (pulled automatically), I have a column for Month, a column for Quarter (using a custom function), and a column for Deduction, using the amount and the category. (To write a custom function, go to Tools > Scripts > Script Editor.)

Finally, in my income sheet, I use sumif() on the range in the other [expenses] sheet with the calculated deductions for that quarter, times my expected tax rate, and I know how much quarterly taxes to pay!
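
The deduction arithmetic is small enough to sketch outside the spreadsheet. What follows is the same logic written in PHP, not the Google Docs custom functions described above. The sample expenses, the revenue figure, and the 25% tax rate are made up; only the 50% Meals & Entertainment rule comes from this post, and every other category is assumed to be fully deductible.

```php
<?php
// Quarter (1-4) from a YYYY-MM-DD date string.
function quarter_of(string $date): int {
    $month = (int) substr($date, 5, 2);
    return (int) ceil($month / 3);
}

// Deductible portion of one expense.
function deduction(float $amount, string $category): float {
    $rates = array('Meals & Entertainment' => 0.5);
    $rate = isset($rates[$category]) ? $rates[$category] : 1.0;
    return $amount * $rate;
}

// Sum of deductions for one quarter, like sumif() over the Deduction column.
function quarterly_deductions(array $expenses, int $quarter): float {
    $total = 0.0;
    foreach ($expenses as $expense) {
        if (quarter_of($expense['date']) === $quarter) {
            $total += deduction($expense['amount'], $expense['category']);
        }
    }
    return $total;
}

// Made-up sample data.
$expenses = array(
    array('date' => '2011-01-15', 'amount' => 100.00, 'category' => 'Meals & Entertainment'),
    array('date' => '2011-03-24', 'amount' => 40.00, 'category' => 'Hosting'),
    array('date' => '2011-04-02', 'amount' => 200.00, 'category' => 'Software'),
);
$revenue = 1000.00; // hypothetical Q1 revenue
$taxRate = 0.25;    // hypothetical expected tax rate

$deductions = quarterly_deductions($expenses, 1);
$estimatedTax = ($revenue - $deductions) * $taxRate;
printf("Q1 deductions: %.2f, estimated tax: %.2f\n", $deductions, $estimatedTax);
```

With these numbers the Q1 deductions come to 90.00 (50.00 for the meal plus 40.00 for hosting), so the estimate is (1000.00 - 90.00) x 0.25 = 227.50.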

(Update: A revised version of this post now appears on the FreshBooks Developer Blog.)

Mar 13 '11 7:56pm

Switching PHP memcache.so extension to memcached.so

I was having some caching issues earlier that I concluded were memcache-related. The memcache terminology is confusing: 'memcache' is the colloquial name, 'memcached' is the daemon, and PHP has both memcache and memcached extensions. The memcache module for Drupal supports both, but recommends the memcached version. I was running the other one, so I decided to switch to see if that would fix my problems.

The swap was harder than I expected, so here's how I did it, in case anyone else wants to do the same. This assumes you already have the daemon and old memcache library working correctly.

First try the simple method. This didn't work for me because I didn't have libmemcached installed. If it works for you, you're lucky:

sudo pecl install memcached
(I specified the version, memcached-1.0.2, to make sure I got the latest stable release, but that number might change by the time you read this.)

Anyway, that didn't work for me - I got an error, "Can't find libmemcached headers". The documentation specifies a --with-libmemcached-dir parameter to handle this. But I didn't have the library installed anywhere, so I had to install it. (Fully install it, not just download it.)

I used /opt to hold the files and the latest version of libmemcached, and ran everything as root (otherwise add sudo to each line, or at least to the make install step):

cd /opt
wget http://launchpad.net/libmemcached/1.0/0.40a/+download/libmemcached-0.40.tar.gz
tar -xzvf libmemcached-0.40.tar.gz
cd libmemcached-0.40
./configure
make
make install

Now try the simple method again: sudo pecl install memcached. If that still doesn't work, specify the directory manually:

cd /opt
pecl download memcached-1.0.2
tar zxvf memcached-1.0.2.tgz
cd memcached-1.0.2
phpize
./configure --with-libmemcached-dir=/opt/libmemcached-0.40/libmemcached
make
make install

(Play around with the configure line there if it still fails. I tried 100 variations until I got it working - I think with pecl install after the full make install on libmemcached - but your results may vary.)

If this worked, there should now be a memcached.so file in your PHP extensions directory.

Now for the PHP config: the documentation on memcached's runtime configuration is sparse. The Drupal module recommends setting memcache.hash_strategy="consistent", but I'm not sure this has any effect on memcached.so. In my setup there was a conf.d/memcache.ini file, symlinked to cli/conf.d (for command line config) and apache2/conf.d. I changed the extension line to load the new file, removed extraneous settings that didn't seem to be documented anywhere, and set the hash_strategy for good measure. Then I checked the config with apache2ctl configtest (the command will differ by distro). That checked out, so I restarted Apache. phpinfo() showed the new extension, my caching problem went away, and all seems well so far.
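
As an alternative to reading through phpinfo(), a short script can report which of the two extensions PHP actually loaded and try a round trip against the daemon. This is only a sketch: the host 127.0.0.1, the port 11211, and the key name swap_check are assumptions, so adjust them to your setup.

```php
<?php
// Report which memcache-related PHP extensions are loaded.
function memcache_extensions_loaded(): array {
    return array(
        'memcache'  => extension_loaded('memcache'),
        'memcached' => extension_loaded('memcached'),
    );
}

// Try a set/get round trip through the memcached extension.
// Returns null if the extension is missing, otherwise true or false.
function memcached_round_trip(string $host = '127.0.0.1', int $port = 11211): ?bool {
    if (!class_exists('Memcached')) {
        return null; // extension not loaded, nothing to test
    }
    $cache = new Memcached();
    $cache->addServer($host, $port);
    $cache->set('swap_check', 'ok', 60); // expires after 60 seconds
    return $cache->get('swap_check') === 'ok';
}

print_r(memcache_extensions_loaded());
var_dump(memcached_round_trip());
```

Run it once from the command line and once through Apache, since the two can read different conf.d directories.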

Feb 23 '11 11:20am

Drupal as an Application Framework: Unofficially competing in the BostonPHP Framework Bakeoff

BostonPHP hosted a PHP Framework Bake-Off last night, a competition among four application frameworks: CakePHP, Symfony, Zend, and CodeIgniter. A developer coding in each framework was given 30 minutes to build a simple job-posting app (wireframes publicized the day before) in front of a live audience.

I asked the organizer if I could enter the competition representing Drupal. He replied that Drupal was a Content Management System, not a framework, so it should compete against Wordpress and Joomla, not the above four. My opinion on the matter was and remains as follows:

  1. The differences between frameworks and robust CMSs are not well defined, and Drupal straddles the line between them.
  2. The test of whether a toolkit is a framework is whether the following question yields an affirmative answer: “Can I use this toolkit to build a given application?” For Drupal the answer is clearly yes, and for apps far more advanced than this one.
  3. The exclusion reflects a kind of coder-purist snobbery ("it's not a framework if you build any of it in a UI") and lack of knowledge about Drupal's underlying code framework.
  4. In a fair fight, Drupal would either beat Wordpress hands-down building a complex app (because its APIs are far more robust) or fail to show its true colors with a simple blog-style site that better suits WP.

Needless to say, I wasn't organizing the event, so Drupal was not included.

So I entered Drupal into the competition anyway. While the first developer (using CakePHP) coded for 30 minutes on the big screen, I built the app in my chair from the back of the auditorium, starting with a clean Drupal 6 installation, recording my screen. Below is that recording, with narration added afterwards. (Glance at the app wireframes first to understand the task.)

Worth noting:

  • I used Drupal 6 because I know it best; if this were a production app, I would be using the newly released Drupal 7.
  • I start, as you can see, with an empty directory on a Linux server and an Apache virtualhost already defined.
  • I build a small custom module at the end just to show that code is obviously involved at anything beyond the basic level, but most of the setup is done in the UI.


One irony of the framework-vs-CMS argument is that what makes these frameworks appealing is precisely the automated helpers - be it scaffolding in Symfony, baking in CakePHP, raking in Rails, etc - that all reduce the need for wheel-reinventing manual coding. After the tools do their thing, the frameworks require code, and Drupal requires (at the basic level) visual component building (followed, of course, by code as the app gets more custom/complex). Why is one approach more "framework"-y or app-y than the other? If I build a complex app in Drupal, and my time spent writing custom code outweighs the UI work (as it usually does), does that change the nature of the framework?

Where the CMS nature of Drupal hits a wall in my view is in building apps that aren't compatible with Drupal's basic assumptions. It assumes the basic unit - a piece of "content" called a "node" - should have a title, body, author, and date, for example. If that most basic schema doesn't fit what you're trying to build, then you probably don't want to use Drupal. But for many apps, it fits well enough, so Drupal deserves a spot on the list of application frameworks, to be weighed for its pros and cons on each project just like the rest.

Oct 27 '10 3:43pm

PHP foreach/reference oddities

I encountered this weirdness today with some basic PHP:

Early in the script I had: (note the reference &$nid)

// validate
foreach($nids as $key => &$nid) {
  if (empty($nid)) unset($nids[$key]);
  if (! is_numeric($nid)) unset($nids[$key]);
  $nid = (int) $nid;
}
 
print_r($nids);

Outputs:

Array
(
[0] => 81
[1] => 1199
)

Then later on...

  foreach($nids as $nid) {
    echo $nid;
  }

Outputs:
81
81

If I change the 2nd loop to:

  foreach($nids as &$nid) {
    echo $nid;
  }

then it outputs 81 and 1199 like it should.

Shouldn't as $nid reset the variable so it's no longer a reference?
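
It doesn't: after a by-reference foreach, the loop variable stays bound to the last array element until it is explicitly unset, and the PHP manual warns about exactly this. A later by-value loop that reuses the same name then writes each value into that last element as it iterates. A minimal sketch of the usual fix, using a simplified two-element input rather than the original script:

```php
<?php
$nids = array('81', '1199');

// First loop: iterate by reference to cast each value in place.
foreach ($nids as &$nid) {
    $nid = (int) $nid;
}
// $nid is still a reference to the last element ($nids[1]) here.
// Without the unset() below, the next loop would write each value into
// $nids[1] as it goes, which is why the second loop printed 81 twice.
unset($nid);

// Second loop: plain by-value iteration now behaves as expected.
$seen = array();
foreach ($nids as $nid) {
    $seen[] = $nid;
}
print_r($seen);
```

With the unset() in place, $seen holds 81 and 1199, and $nids is left untouched by the second loop.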

Jun 27 '10 11:05pm

Debugging PHP with XAMPP, MacGDBp, and Textmate

Technosophos has a great tutorial on setting up a PHP debugging environment with XDebug and MAMP. I'm using XAMPP and it works the same way; just change the path where xdebug.so goes.

However, the Textmate part - using the xdebug.file_link_format parameter - doesn't seem to be working. Apparently others are having the same problem, possibly related to Snow Leopard; I'm not sure if there's a solution. It's not necessary for the debugger to work, though; it's just a convenient way to view the error-causing code.

Feb 3 '10 10:38am

HipHop PHP Engine by Facebook

Facebook formally presented its HipHop project last night. (Video below.) PHP is written in C but interpreted at runtime, trading performance for code simplicity. HipHop instead converts PHP into optimized C++ ahead of time, then compiles that C++ so it runs much faster than PHP would otherwise run. They've been running it live for six months and claim it uses 50% less CPU than the standard engine with equal traffic, and 30% less CPU with twice the traffic (compared to the Zend engine with APC opcode cache).

Most of the "magical" features supported in PHP (but not in C++) were preserved, but eval(), which allows arbitrary code to be run in the script, was removed. This means Drupal can't use HipHop, for one thing.

The optimization potential depends on "how much of your code looks like C++?" Flexible variable types, for instance, run slower than type-cast variables, so HipHop has an "inference engine" to convert to C++ variable types, gaining performance for clear types but not so much when using "variant" types.

HipHop also uses its own HTTP server, so no Apache support (yet). Tabini notes, "Of course, this doesn’t preclude you from running one or more HipHop projects against separate ports on the same machine and then use Apache (or Squid, or any other server) to reverse proxy to them."

It'll all be open source, of course: the project home is here, and code will be on GitHub "soon."

Update: Four Kitchens ponders ways Drupal could be modified to support HipHop. (The changes suggested there should probably be done regardless.) I look forward to seeing that in their Pressflow distribution.

Dec 23 '09 2:53pm

Path Traversal is frighteningly simple

This StackOverflow question about path traversal prompted me to see how easy it is.

All it takes is a PHP file like this on your server:

<?php
// explore path traversal vulnerabilities
ini_set('display_errors', 'on');
ini_set('error_reporting', E_ALL);
 
$path = isset($_GET['path']) ? $_GET['path'] : '';
 
  if (empty($path)) {
    echo "No path.";
    die;
  }
 
echo $path . '<br/>' . realpath($path) . '<hr/>';
 
if (is_dir($path)) {
  echo '<pre>' . print_r(scandir($path),true) . '</pre>';
}
else {
  $file = file_get_contents($path);
  echo htmlspecialchars($file);  
}

... and someone can gain total read access to your file system. Run that script with ?path=../../etc/passwd, for example, and the system's user list is printed straight to the screen, because most Unix systems make system files like this one world-readable (the final 4 in the permission bits) by default. So DO NOT put that code on your server!

Of course, that exact code would never be used, but there are all kinds of other scenarios where user-submitted parameters or cookies are passed through to the file system. That's one of the advantages of working in a framework (vs coding an app from scratch) - all these considerations have (presumably) been taken into account, and the API (if used correctly) should handle it. But it just reminds me how critical it is to escape all characters, never pass through form values directly, never load files based on unfiltered user input, etc etc... Apache's access directives are useless once the script is running server-side.
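
One common guard, sketched here under assumptions, is to resolve the requested path with realpath() and refuse anything that lands outside a fixed base directory. The function name resolve_inside and the idea of a single base directory are mine for illustration; a framework's file API may do this differently.

```php
<?php
// Resolve $userPath relative to $baseDir and return the real path only if
// it stays inside $baseDir. Returns null for escapes and missing files.
function resolve_inside(string $baseDir, string $userPath): ?string {
    $base = realpath($baseDir);
    $target = realpath($baseDir . DIRECTORY_SEPARATOR . $userPath);
    if ($base === false || $target === false) {
        return null; // base or target does not exist
    }
    // Compare against the base plus a trailing separator so that a sibling
    // like /var/www/files-secret does not pass for /var/www/files.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if ($target !== $base && strpos($target, $prefix) !== 0) {
        return null; // resolved path escaped the base directory
    }
    return $target;
}

// Null unless the resolved path really is inside the base directory.
var_dump(resolve_inside(__DIR__, '../../etc/passwd'));
```

Because realpath() collapses ../ segments and follows symlinks before the comparison, the check is made on where the path actually points rather than on what the string looks like.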

Nov 13 '09 3:09pm

Old-school SSI on PHP5/Apache2.2

I had to set up a local copy of an old site running (on the server) PHP 4 and hundreds of old-school SSIs (server-side includes), of the <!--#include variety. It took a bunch of time to get it right, but in the end it's pretty simple:

  1. Make sure mod_include is enabled in your httpd.conf.
  2. Have this directive in your httpd.conf (it should be possible in the virtualhost or .htaccess too, but it didn't work for me there):
      <Directory />
      Options Includes
      </Directory>
     (It's possible the Directory can be other than root (/) -- I think the key is just to put it in httpd.conf.)
  3. Also in httpd.conf, put the directive SetOutputFilter INCLUDES.

In older PHP 4 releases, I think, there was a configuration directive --enable-track-vars that allowed SSI variables to work in PHP, or something like that, but it's deprecated and built in now, so PHP can output SSI directives.

Make sure to restart Apache after adding this, of course.

If I run into other issues with this, I'll update this post; in the meantime, it seems to be working.