Stay away from BT Infinity

Like all technical people, I like to have the best internet connection I can get. For years I was with BE Broadband here in the UK, the default go-to for a technical audience. They had great support staff and, when they launched, the fastest speeds available. As the years went on, though, they continually teased their customers about fiber trials, yet no one really saw the service. Eventually, they were sold to Sky Broadband and the networks merged. Not wanting my primary medium for communication to be subject to the whims of Rupert Murdoch’s empire, I bailed like tens of thousands (maybe hundreds of thousands) of others.

This left me and many others looking for a new home. Yes, to highly technical people who like to host their own services such as XMPP, OpenVPN, DNS, a SIP PBX and their own email servers on their own connection, it really is like looking for a home. You want an ISP you can trust. Except I was one of the few who stupidly got caught up in a quest for speed. I knew I wanted more speed and the options out there were not looking good. Sky was out of the question because I didn’t want my content being controlled by a media empire. I didn’t want to go with BT because of their infamously terrible service, and many of the other big providers were out of the question due to price or limited features. I wanted no limits, no throttling, no blocked ports and total control. The sad thing is, I settled on BT and ended up getting the worst internet service of my life.

Right after I signed up, BT Infinity announced that they were enabling Carrier Grade NAT (CGNAT) on all connections, where ten people would share the same IP address. Yes, this is as idiotic as it gets. It is what mobile carriers use, but it would never have been needed if they had prepared themselves for IPv6 early on. We all knew that we’d be running out of IPv4 addresses between 2013 and 2015, and we have. Thankfully there is a way to opt out of this, but there isn’t a way to properly opt out of having specific ports blocked by BT, nor is there a way to get a static IP address. Thankfully, I’m using a Dynamic DNS system that hasn’t let me down, so I can maintain at least some sort of remote access to my network via OpenVPN. The biggest hit to me was not being able to host my own mail server anymore. It was working fine one day, then out of nowhere in 2014, BT suddenly started to block it. The first thing I did was get on the phone to them and explain what I was doing and why I needed it. They very bluntly said that it’s not a feature of that service and that I should look at upgrading to their business service. That is something I wasn’t going to do, since I’ve experienced horrible downtime thanks to their bad network and their absolutely terrible BT HomeHub 5, which I personally believe should be consigned to recycling plants all over the UK so its components can be made into something that is actually useful.
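If you want to check whether your own line has been put behind CGNAT, one rough test is to look at the WAN address your router reports. The sketch below assumes the ISP uses the shared address space that RFC 6598 reserves for carrier-grade NAT (100.64.0.0/10); the post does not say which range BT used, so treat that as an assumption. The helper name `classify_wan_address` is made up for illustration.

```ruby
require "ipaddr"

# Shared address space reserved by RFC 6598 for carrier-grade NAT.
# Assumption: the ISP uses this range for its CGNAT customers.
CGNAT_RANGE = IPAddr.new("100.64.0.0/10")

# RFC 1918 private ranges, included for comparison.
PRIVATE_RANGES = %w[10.0.0.0/8 172.16.0.0/12 192.168.0.0/16].map do |cidr|
  IPAddr.new(cidr)
end

# Hypothetical helper: classify the WAN address shown by the router.
def classify_wan_address(address)
  ip = IPAddr.new(address)
  return :cgnat if CGNAT_RANGE.include?(ip)
  return :private if PRIVATE_RANGES.any? { |range| range.include?(ip) }

  :public
end
```

A result of `:cgnat` or `:private` suggests that inbound connections (mail, OpenVPN and so on) will not reach you directly. A `:public` result does not prove the opposite, since an ISP could in principle run CGNAT on other ranges.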

So now, once again, I’m in the position of looking for a new ISP. I would like to get a gigabit fiber connection, but sadly in my part of Glasgow in Scotland, I’m subject to the whims of the BT Openreach and LLU network. Virgin Media aren’t even in my area. So instead, I’m going with the ISP I should have gone with in the first place: Zen Internet. Zen is the real power user’s ISP left in the UK that is actually affordable, totally unlimited and fast. They offer you a free static IPv4 address and have stated that a full IPv6 deployment is coming within 12-18 months. This is music to my ears. As soon as my contract is up, they will have yet another gleefully happy customer who will feel right at home. I urge other people who are currently in the same situation I am to check them out.

Time to rethink the cloud

Please forgive me, dear readers, as this is an article that needs to be written, and it has been a few times around the internet. One of the best examples out there is by Jason Scott, entitled “FUCK THE CLOUD”, and I honestly couldn’t have said it better myself. In fact, you should check out Jason’s other articles linked at the bottom of that one, especially the one about Ma.gnolia, where he makes pretty much the same points that I do. In this article, I’m not merely going to rant about how the suits and marketing folks took “the cloud”, a simple abstraction traditionally drawn on a whiteboard to mean an external network outside your control, made it seem fluffy and in turn made you cede all control over your infrastructure. Instead, I’m going to compare the two very real and vastly different paradigms that are on offer when it comes to your infrastructure: infrastructure as a service (the cloud) and infrastructure as code. Yes, both can complement each other quite well. However, relying on infrastructure as code brings great benefits and value, whereas the cloud brings added benefits, added value and greater risk. Allow me to explain in the most Penn and Teller way I can.

Infrastructure As A Service (Sadly known as “The Cloud”)

IaaS is actually a very useful tool to use when needed, such as when you’re experiencing a greater than expected load on your systems and you need some fall-back to handle the extra load. That is the perfect scenario where this type of stuff comes in so very handy. There are of course other applications of IaaS, such as large scale data storage if bandwidth is an issue for your application. A perfect example would be something that serves up large media files, where offerings such as Content Delivery Networks (CDNs) are the ideal solution if you don’t want to, or can’t, roll your own. Anyone who has utilized Akamai or S3 knows the benefits. It might seem reasonable to serve from “the cloud”, and it is, but the problem I have is that companies all over the world store everything they have on outside third party infrastructure such as Amazon Web Services. People, this is bad! The marketing folks at Amazon, and the fanboys, won’t tell you this, but this is not what you’re supposed to do! Any self-respecting network or system architect would never offer up the entire crown jewels of their company and place their faith in someone like Amazon or Microsoft. Re-read that: faith. You’re putting your faith into a company to store all your services, your data, your customers’ data and probably even your backups. Faith is not a backup strategy. It’s a lousy, lazy and downright ugly idea, yet most of these so-called “experts” have gotten it into their heads that it’s the best solution to all their problems. It isn’t. It’s a bloody nightmare with additional overhead that cedes so much control over your infrastructure that it truly is not worth it.

Infrastructure as a service is a huge business risk. When done incorrectly.

Security

To understand why it is a business risk, we must re-learn the value of our companies’ intellectual property. If we truly value it and do not want it stolen or compromised through poor security, then we have to take that power into our own hands. Yes, it is not easy, but if you actually value these things then you’ll hire people who actually know how to properly administer, secure and deploy systems at all levels. Most companies now do not do that. They’re ceding far too much of their control, and with it their security, to firms like Amazon. Again, you are putting your faith in Amazon, or worse, Microsoft, that they have this “security thing” figured out and it won’t be a problem. You think that you just have to worry about the security of your own image or your own application? News flash. Most companies, when they deploy an image, rarely if ever do any kind of security hardening on it beyond the basics they get when they create it through the AWS wizard. There is a reason that we have traditionally drawn external infrastructure outside of our control as a cloud: it is nebulous in all its parts. This is not an added benefit, this is a risk. How are we to accurately audit the security of the full stack of such a nebulous system?

You cannot. Believing you can is bullshit.

Cloud “security”

Most system operators/administrators, or “DevOps” as they sometimes incorrectly like to be called these days, will not be able to tell you about the security of their infrastructure if it is hosted in the cloud. If your job is to push buttons on a website to administer the operations of your stack, or simply edit configuration files, then you’re not allowed to call yourself a “DevOps guy” and need to hand in your DevOps card, and probably your man card with it. It’s sad that the creation and growth of what is unquestionably a fantastic technology, virtualization, has made traditional (and the up and coming) system operators dumber and more ignorant than a bag of rocks. Most administrators, or “cloud babysitters”, will not be able to tell you if their images and the software contained within them have full RELRO, if stack canaries are inside their binaries or if they are position independent executables (PIE). Ask any typical administrator what they are and they’ll draw a blank. RELRO, or Relocation Read-Only, is a simple exploit mitigation technique applied during the build of your binaries (if the options are set). It makes the relocation data, including the global offset table when full RELRO is used, read-only once the program has started, which stops an attacker from overwriting those entries to redirect execution. Stack canaries are simple protections for the stack whereby a random integer is pushed onto the stack just after the function return pointer has been pushed on. The canary value is then checked before the function returns. This is a major annoyance if you’re an attacker trying to smash the stack. Whilst this isn’t the be-all and end-all of exploit mitigation, it does make an attacker work much harder and is generally an effective pain in the ass. Explaining all of these security measures in your binaries is well outside the scope of this article. Thankfully, with Fedora 23, all packages are going to be rebuilt with all of these security measures enabled. 
For the most sensitive of services, these are already enabled and have been for a while, even across many distributions. Thankfully though, it’s easy to enable this yourself, as rebuilding a Fedora package is insanely easy and there are a few I have rebuilt this way for my own use. I wonder how many system administrators or cloud babysitters do the same thing these days? Probably none at all. Most of them barely understand how the systems truly work. You’re supposed to be the master of your domain, the king of your kingdom. Instead, system administration has become the twenty-first century equivalent of paper pushing.
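To make those three checks concrete, here is a minimal heuristic sketch that classifies a binary from the text that `readelf` prints. It only pattern-matches a few well-known markers (the `GNU_RELRO` program header, the `BIND_NOW` dynamic flag, the `__stack_chk_fail` symbol and an ELF type of `DYN`), so it is a rough indicator and not an audit tool. The function name `hardening_report` is made up for illustration, and a `DYN` type is also what shared libraries report, so the PIE result is only meaningful for executables.

```ruby
# Heuristic sketch: classify hardening from the combined text output of
# `readelf -W -h -l -d -s <binary>`. It matches known markers only.
def hardening_report(readelf_output)
  has_relro    = readelf_output.include?("GNU_RELRO")
  has_bind_now = readelf_output.include?("BIND_NOW")

  relro =
    if has_relro && has_bind_now
      :full      # relocations resolved at startup, GOT made read-only
    elsif has_relro
      :partial   # some sections read-only, lazy binding still in use
    else
      :none
    end

  {
    relro: relro,
    # Binaries built with stack protection reference this libc symbol.
    stack_canary: readelf_output.include?("__stack_chk_fail"),
    # Position independent executables have an ELF type of DYN.
    pie: !(readelf_output =~ /^\s*Type:\s+DYN\b/).nil?
  }
end
```

It could be fed with something like `hardening_report(IO.popen(["readelf", "-W", "-h", "-l", "-d", "-s", path], &:read))`, where `path` is whichever binary you want to inspect.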

I’ve seen, interviewed with and contracted with companies who have their entire infrastructure on the cloud. Including their backups. What if everything you had was compromised? What if your keys to your AWS instances were stolen (I’ve seen it happen) and with it, they took your backups? In the last several years, the cloud has been seen as an attack vector against companies. You see it all the time now.

That cloud is actually a fart (CAPEX vs OPEX)

If you buy in to the typical drivel that is peddled by the suits at Amazon, Microsoft or SAP, then you belong to one of the most gullible and idiotic groups of people, alongside flat-earthers, Scientologists and viewers of the Fox News Channel. I’ve seen the same marketing material trotted out by all three companies to try and convince the gullible CIO/CTO/CEO to buy into this newfangled buzzword. The first place they go to is the capital expenditure (CAPEX) versus operational expenditure (OPEX) argument, where they try to convince people that running all their services on the cloud and giving them a big cheque every month will be cheaper than the up-front investment in servers and services. Yes, there’s no up-front cost for buying the physical hardware that you need, that’s true. However, the cloud typically doesn’t work out cheaper in the long term, and as I’ve pointed out above, there are great IT security and business risks involved with a total buy-in to the cloud, which have to be taken into consideration for this type of decision. With the increase of competition in this space, I’m sure the prices will be reasonable enough for most people to consider this, even start-ups. It’s a compromise. These services should always be involved in infrastructure planning and high load contingencies, ready to complement the rest of your services when load spikes. Not replace them.

If you buy some of my cloud, I can buy a Jaguar F-Type

The debate should not be about CAPEX vs OPEX, rather it should be about operational and business value. If your company owns its own hardware, that is an asset that’s added on to the company book value. Compared to all the other assets in your company, you might think that a bunch of servers doesn’t really add all that much value. Sure, like all other assets there is a depreciation factor associated with them. That’s less of a problem than it was ten years ago, as the hardware inside servers will stay useful for longer than previous generations of hardware did, regardless of what dollar value the bean-counters associate with it. But if the servers don’t have much value “in the grand scheme of things”, then how do you value your intellectual property? Would you store all your engineering documents, designs and schematics in a desk drawer of another company? Would you first have that other company look through all your patent documents as they’re mailed between your legal team, your engineers and the patent office? That’s what you’re doing if you offload all your email to Microsoft and Google. A lot of my infrastructure contracts over the last few years have been to design, deploy and administer secure email systems, because businesses, especially engineering and scientific companies, do not trust such valuable information to be hosted on other people’s servers.

The other value you get from owning these systems yourselves is that you will also have a more knowledgeable, responsible and smarter team of people within your company who know how to do system operations properly. By offloading all your technology to the cloud, the only planning you truly have is faith in those systems. I don’t have faith in any system, that’s why I design and plan extensively to mitigate any scenario that might arise. If you have an operations team whose answer is always automatically, “we’ll do it on the cloud”, you need a new operations team. Fire them and get people who are actually skilled and take pride in their craft. Hire actual engineers to do this. These days, you have to. You need people with experience writing code. Recently, I interviewed with a company for a DevOps position and they rejected me because I was too “development focused” and I asked too many questions, so they thought I was unable to take direction. At first I was upset because it was somewhere that I truly wanted to work; now I’m still laughing at how ridiculous it sounds. It seems as if they don’t actually know what DevOps is. If you hire good people, the value they add to your business will pay off every single day with every single transaction. Even if it’s something small, such as starting out by bringing your email, DNS and corporate website in house.

The cloud is not always the answer to scalability

If you will indulge me while I channel my inner Penn and Teller, this one really pisses me off: the fact that organisations view the cloud as the only answer to scalability. It isn’t. Usually it’s sheer laziness at work, where they attempt to throw more hardware and more instances at a problem. People these days have forgotten how to optimize. They’re scared to look at creative or “non-traditional” approaches. This is especially prevalent within web companies who make heavy use of Django or Rails and are too scared to try to increase throughput in their application servers, instead resorting to, “fire up another instance”. All my DevOps friends who work in those environments have a single stack set up per instance and don’t even fully utilize the CPU cores. Rather than scaling up on that box first and running multiple application instances per box, their solution is just to scale out. It’s a waste of resources.
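As a back-of-the-envelope illustration of scaling up on a box before scaling out, the sketch below works out how many application worker processes a single machine could reasonably host, bounded by its cores and its memory. The function name, the per-worker memory figure and the reserved headroom are all assumptions for illustration; real numbers have to come from measuring your own application.

```ruby
require "etc"

# Hypothetical sizing helper: how many worker processes fit on one box?
# Bounded by CPU cores and by memory left after reserving some headroom
# for the operating system. Always returns at least one worker.
def worker_count(cores:, memory_mb:, worker_memory_mb:, reserved_mb: 512)
  by_cpu    = cores
  by_memory = (memory_mb - reserved_mb) / worker_memory_mb
  [[by_cpu, by_memory].min, 1].max
end

# Example: size for the machine this runs on, assuming 4 GB of RAM and
# roughly 300 MB per worker (both made-up figures).
puts worker_count(cores: Etc.nprocessors, memory_mb: 4096, worker_memory_mb: 300)
```

Only once a box is actually saturated on one of those limits does firing up another instance start to make sense.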

Yes, the cloud can greatly aid you in your need to scale, but you have to be doing things right in the first place. As per the previous examples that I have cited, the cloud is perfect for scaling media serving or even the size of your data if you so desire or have the need.

Every cloud has a silver lining

If there’s anything that I’m trying to sell you on, it’s responsibility and accountability. The cloud definitely has a place in the world and provides a service that can add value to your organisation, if used responsibly. As I have already stated, it’s perfect for excess load, distribution and so forth, but the responsibility for your organisation and infrastructure is yours. Properly designing systems is not hard. In fact, it’s quite easy if you have the skills, precisely because you actually have control.

So what is the correct paradigm to look at your infrastructure?

Infrastructure As Code

The reason that many people have been attracted to the cloud in the last few years is that they falsely assume their own server farms are too hard to manage or maintain. Bullshit. These days, we have fantastic tools available to us, and the absolute best, the champion of “infrastructure as code”, is Puppet. The guys over at PuppetLabs have unquestionably made everyone’s lives so much easier. Not just in the cloud environment, but also for in-house server farms. Being able to automate and easily manage your infrastructure is truly a blessing. The entire concept of infrastructure as code is something that we’d been waiting on for a very long time. It’s actually somewhat disappointing that none of this had already been mature for ten years by the time the guys at Puppet showed up. It was the ultimate missing link for administration, and if you’ve ever used it, you’ll know why. It’s impossible to go back to the old days before Puppet. Not only has it eased the management and deployment of production environments, but when combined with tools such as Vagrant it becomes invaluable within your own development environment. Infrastructure as code that is versioned within git repositories, combined with virtualization, is the holy grail of system operation and deployment in the twenty-first century. This is a pattern that will live on for decades.
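For anyone who has not used such a tool, the core idea is that you declare the state you want as data and let the tool work out what has to change, so that running it twice is harmless. The toy sketch below shows that idea in plain Ruby. It is not Puppet’s DSL or API, and the resource names and the `plan` and `apply` helpers are invented purely for illustration.

```ruby
# Desired state, declared as data. In practice this is the part that
# would live in a git repository. All names here are made up.
DESIRED = {
  "package:openvpn" => { ensure: :installed },
  "service:openvpn" => { ensure: :running, enabled: true },
  "file:/etc/motd"  => { content: "Managed by code\n" }
}.freeze

# Compare desired state with current state and list the changes needed.
def plan(desired, current)
  desired.each_with_object([]) do |(name, want), actions|
    have = current[name]
    if have.nil?
      actions << [:create, name, want]
    elsif have != want
      actions << [:update, name, want]
    end
  end
end

# Pretend to apply the actions, returning the resulting state.
def apply(actions, current)
  actions.each_with_object(current.dup) do |(_verb, name, want), state|
    state[name] = want
  end
end

current = { "package:openvpn" => { ensure: :installed } }
actions = plan(DESIRED, current)
current = apply(actions, current)
puts "changes on first run:  #{actions.size}"
puts "changes on second run: #{plan(DESIRED, current).size}"
```

The second run finding nothing to do is the property that makes this approach safe to run against a whole server farm on a schedule.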

One of the primary reasons that people started to switch to the cloud in the first place was to ease the burden of setup. I can fully remember the (now agonizing) days when each and every server or VM image had to be set up manually. I also remember trying to maintain massive Bash or Perl scripts just to set up the basics of the system or your software upon it. It was a nightmare. Now, the friction to having your own hardware is much lower than it used to be. Most of the problems are gone. The only one that really remains is the hardware cost, which is easily managed and mitigated by using the cloud only when you’re under peak load. If your business truly does require you to invest in more hardware servers due to a stable growth trend, then do it. Doing so is a true investment that will pay off, and having the right team in place to manage it will add untold amounts of value to your operations. There is absolutely no reason why a company should have a single point of reliance on a sole vendor such as Amazon or Microsoft to host their services. Likewise, you shouldn’t keep all your hardware servers located in one place. All the new DevOps tools now allow you to easily manage a multiple site setup, which will then mitigate your reliance on a single co-location vendor. It’s painlessly easy to migrate running VM guests to other VM hosts on the other side of the world from your laptop. I’ve done it. The annoying wait for additional hardware doesn’t seem to be a problem these days either. Most datacenters will have hardware available for you to put in, or you can have new hardware delivered overnight to your co-location facility to be installed. If your business plans ahead properly and your operations team is part of those plans, with metrics in hand, then the previous headache where you had to wait for the next hardware cycle is largely averted. This is not hard to do.

Leverage the power of Infrastructure as Code as much as you can. I plan to go more in depth in future articles about DevOps, Puppet and yes, even cloud infrastructure. The intent for all categories is to showcase their power and what they’re best used for, to increase adoption of the right tool for the right job. In essence, that is my main problem with the over-enthusiastic push towards putting absolutely everything on the cloud. It is not the right tool for the job all of the time. You wouldn’t hire a steel post pile driver to nail a carpet down, would you?

Another article that I aim to write sometime this week is about how some cloud services are actually detrimental to your privacy, which is also, in part, what this article is about. Whether for personal use or for business, the cloud involves you having to say, “Yes, I relinquish control”.

JRuby 9000 Performance

This is a follow-up article to the one I wrote in May 2015 titled “The Problem with Python”. In that article I talk about the challenges that face Python (and standard Ruby) in that they’re stuck with their respective Global Interpreter Locks. I go on to talk about how Python doesn’t really have a solid plan in place to either remove the GIL or bring about a solution that would give you true concurrency and parallelism. In short, you need to be able to use real threads. I proposed that a good starting point for the Python community would be to experiment and play with, or even migrate over to, something like IronPython now that Microsoft has released a whole host of the .NET CLR infrastructure as open source. The problem with Python, though, is that it has far too many dependencies on its C modules to make that a realistic endeavour, at least for now.

Which now brings me to Ruby. Ruby has exactly the same problem. A few years ago they moved from a “green thread” implementation to their own GIL, but they are still stuck with the single line of execution problem that plagues Python. The great news is that there is a perfect solution out there and it works brilliantly. It is fully compatible with the latest version of the Ruby language specification and you have very few issues with gems. That solution is JRuby. It’s a full Java implementation of Ruby that runs on the JVM and it is truly fantastic. It gives you access to proper threading, and with it comes concurrency, parallelism and all the benefits therein. Not only that, but it does so in a way that is truly a dream to code in. It truly is beautiful and fun. Yes, those are the platitudes espoused by Ruby fanatics for almost the last decade, only they’re not empty platitudes. My first experience with Ruby was in 2011 and I have to admit it was somewhat painful, though it didn’t have to be. The problem was that I was learning Rails without first learning Ruby. A more open mind on my part wouldn’t have gone amiss either, since I was so entrenched in doing things the way a typical programmer with a C++ background would. Thankfully, I saw the light and started to love the language, the ecosystem and yes, the community around it. The big drawback for me with Ruby was always the performance, which prevented me from using it in truly awesome projects. Normally the projects I do are big, ambitious and require stupid amounts of performance, so Ruby was consigned to the part of my day-to-day that was previously taken up by Bash, Perl and Python. A few years ago that started to change, when I was writing a whole bunch of Metasploit modules. It was at that moment that I started to truly fall in love with Ruby and was looking for projects where I could use it. 
First I started off with converting a whole bunch of my old scripts from Bash, Perl and Python into Ruby where appropriate. I was even planning grandiose projects where Ruby would be perfect, but I always had the same nagging problem… performance. Or more accurately, scalability across many cores and true concurrency and parallelism. Thankfully, that is now no longer the case thanks to JRuby.
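To show what “real threads” means in practice, here is a small CPU-bound sketch that splits a prime count across several threads. The same code runs on both implementations: on standard Ruby the GIL makes the threads take turns, while on JRuby they can be scheduled on separate cores. The function names and the choice of workload are just for illustration, and no timings are claimed here.

```ruby
# Count primes in a range with plain trial division (deliberately CPU-bound).
def count_primes(range)
  range.count do |n|
    n > 1 && (2..Math.sqrt(n).floor).none? { |d| (n % d).zero? }
  end
end

# Split 1..limit into slices and count each slice in its own thread.
def parallel_count_primes(limit, threads: 4)
  slice = (limit / threads.to_f).ceil
  workers = (0...threads).map do |i|
    first = i * slice + 1
    last  = [first + slice - 1, limit].min
    Thread.new { count_primes(first..last) }
  end
  # Thread#value waits for the thread and returns its block's result.
  workers.map(&:value).reduce(0, :+)
end

puts parallel_count_primes(100_000, threads: 4)
```

Timing this under `ruby` and under `jruby` with a larger limit is an easy way to see the difference on your own hardware.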

Benchmarking the rubies!

So let’s start to get into some solid numbers and see for ourselves. For this benchmark, I’m using a simple program I wrote and released on this blog called log-ipsnarf.rb. It’s simple: you give it a log file, it goes through and looks for IP addresses and then outputs them all into another file for you to do further processing. It also checks for hostnames, resolves them and gives you the IP addresses of those hosts. Shamefully, I could have written this better, optimized it better and cleaned it up before releasing it. It was hacked together rather quickly, but it’s a nice little utility, so I thought it was worth sharing. So do please forgive me. The downside to this program is that it’s horrendously slow when you’re inputting a large log file. For this benchmark, the input is a 9.9MB weechat IRC log file that has 31833 IP addresses within it, all mixed up with the usual stuff you’d see in an IRC log file. The actual number varies since it includes those hosts that could be resolved. When I started writing this article this morning, there were fewer resolving than on the final runs. The nature of “real-world” benchmarking! There is no removal of duplicate IP addresses, as one of the things I was measuring at the time was the occurrence of each IP address for further analysis.
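This is not log-ipsnarf.rb itself, just a minimal sketch of the extraction step described above, assuming IPv4 addresses only. The regular expression and the helper names are illustrative choices. Duplicates are kept on purpose, in line with the goal of counting occurrences.

```ruby
# One octet: 0-255. The whole pattern uses non-capturing groups so that
# String#scan returns the full matches as plain strings.
OCTET = /(?:25[0-5]|2[0-4]\d|1?\d?\d)/
IPV4  = /\b(?:#{OCTET}\.){3}#{OCTET}\b/

# Return every IPv4 address found in the text, duplicates included.
def extract_ips(text)
  text.scan(IPV4)
end

# Count how often each address occurs.
def ip_occurrences(text)
  extract_ips(text).each_with_object(Hash.new(0)) { |ip, counts| counts[ip] += 1 }
end

line = "12:01 -!- bob [~bob@192.168.1.10] has joined #chan (seen from 10.0.0.1, 10.0.0.1)"
p ip_occurrences(line)
```

For a real log you would call `extract_ips` per line while streaming the file, rather than reading the whole thing into memory first.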

Whilst I haven’t profiled this code yet, I do suspect that a good portion of the performance problems might also be attributed to the resolving of the hostnames over the network. This is completely fine, but for me, this is also a real world scenario and as such, I’m not going to alter the purpose of this program just to make a benchmark appear better. I suspect that this might be a good area where I could also squeeze some performance benefits out later on.
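Because duplicates are not removed, the same hostname is likely to turn up many times in one log, so one cheap thing to try would be caching lookups. The sketch below is an assumption about where time could be saved, not a measured fix, and it is not taken from the original program. The `CachingResolver` class is hypothetical; it wraps `Resolv.getaddresses` from the standard library by default and accepts any callable so it can be exercised without touching the network.

```ruby
require "resolv"

# Hypothetical wrapper that remembers the result of each hostname lookup.
# Not thread-safe as written; guard the cache with a Mutex if it is
# shared between threads.
class CachingResolver
  def initialize(lookup = ->(host) { Resolv.getaddresses(host) })
    @lookup = lookup
    @cache  = {}
  end

  # Return the cached addresses, performing the lookup only once per host.
  def addresses(host)
    @cache.fetch(host) { @cache[host] = safe_lookup(host) }
  end

  private

  # Treat any resolver failure (timeouts included) as "no addresses".
  def safe_lookup(host)
    @lookup.call(host)
  rescue StandardError
    []
  end
end
```

Failed lookups are cached as empty results too, which avoids paying the timeout repeatedly for the same dead hostname within a single run.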

The system that is being used for this benchmark is nothing impressive. It’s my daily driver, a ThinkPad W500 with 4GB of RAM and an Intel Core2 Duo T9400 CPU running at 2.53GHz. The operating system is Fedora 22 64bit. Since I’m going to be comparing JRuby, I’m also using the latest version of Oracle’s JVM. The output from jruby -v is the following.

The system is running with everything you’d probably expect from a typical workstation, including the desktop environment, email client, pidgin, vim, more terminals than I can count and LibreOffice Calc for charting the benchmark times. Since I typically run this script on my ThinkPad, I see no reason why I shouldn’t benchmark in real-world conditions as opposed to on a minimalistic benchmarking host dedicated to just that purpose.

I’m measuring the time that it took each run to complete by using /usr/bin/time with the following arguments:

I didn’t bother looking at CPU usage for the process or how much RAM it took because, as it is a batch processing program, I’m more interested in how fast it can get done. So let’s have a look at the program again.

Github: log-ipsnarf.rb

Now let’s have a look at the benchmark results comparing this program running on Ruby 1.9.3, Ruby 2.2.2 and JRuby 9.0.0.0. The times shown are minutes:seconds, followed by whatever the time command reported after the decimal point on the seconds. I’m not a spreadsheet expert so I couldn’t properly get that to show in the chart. My apologies for that.
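One way around the charting problem is to convert the elapsed times to plain seconds before they go anywhere near a spreadsheet. The sketch below assumes the times look like the elapsed format GNU time prints, that is `[hours:]minutes:seconds` with a fractional part on the seconds. The exact arguments used for the runs are not shown in this post, so that format is an assumption, and the function name is made up.

```ruby
# Convert an elapsed time such as "1:23.45" or "1:02:03" into seconds.
# Each colon-separated field is worth 60 times the one after it.
def elapsed_to_seconds(elapsed)
  elapsed.split(":").reduce(0.0) { |total, part| total * 60 + Float(part) }
end

%w[0:41.07 12:30.5 1:02:03].each do |sample|
  puts "#{sample} -> #{elapsed_to_seconds(sample)} s"
end
```

`Float()` raises on malformed input, which is preferable here to silently charting a zero.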

[Chart: run times for the original program on Ruby 1.9.3, Ruby 2.2.2 and JRuby 9.0.0.0]

From this data, we clearly see that Ruby 2.2.2 and JRuby 9000 are a lot faster than Ruby 1.9.3, but the thing that stands out is that JRuby 9000 is slightly slower than 2.2.2. I think part of this has to do with the JVM spin-up time, but for the purpose of benchmarking a small utility program, there’s no point in tweaking the JVM settings, so they’re kept as standard. Clearly, when you have such a long run time for a program, it is begging to be optimized. Let’s do just that.

Optimizing the rubies!

Well, clearly those numbers are completely unacceptable. The log file that I put through it for this benchmark wasn’t even the largest one I’ve ever put through it for the analytical purpose that the program was created for. So now we have to work on optimizing the code. The rule of thumb for any programming is that before you decide to make it multi-threaded or have stuff run in parallel, you must optimize it in a single thread first. That is what we’re going to do here. The following snippet is where I refactored the code, created a queue and added threading. The program reads the file in chunks of a certain size and passes those lines from the file into the queue to be processed. I would have created a post at each step of the way, but it would have been far too long. I took inspiration for this from the work that Charles Nutter presented at Rubyconf India 2014. Again, please forgive me. There are no doubt a billion different ways to do this better.
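The refactored snippet itself is not reproduced here, so the following is only a rough sketch of the pattern just described: read the input in chunks of lines, push each chunk onto a queue and let a small pool of threads work through it. The method name, the chunk size and the worker count are assumptions for illustration, not the values used in the benchmark.

```ruby
# Read lines from io in chunks, hand each chunk to a pool of worker
# threads through a queue, and collect whatever the block returns.
# Results come back in completion order, not input order.
def process_in_chunks(io, chunk_size: 1000, workers: 4, &block)
  queue   = Queue.new
  results = Queue.new

  pool = Array.new(workers) do
    Thread.new do
      # :done is a sentinel telling this worker there is no more input.
      while (chunk = queue.pop) != :done
        results << block.call(chunk)
      end
    end
  end

  io.each_line.each_slice(chunk_size) { |chunk| queue << chunk }
  workers.times { queue << :done }
  pool.each(&:join)

  Array.new(results.size) { results.pop }
end
```

Used with a log file, it might look like `File.open(path) { |f| process_in_chunks(f) { |lines| lines.grep(/\d+\.\d+\.\d+\.\d+/).size } }`, where `path` and the pattern are placeholders. A bounded `SizedQueue` would be worth considering for very large files so the reader cannot run far ahead of the workers.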

As you can see from the following chart, the refactor and added performance improvements help a lot when running it on the normal Ruby interpreter. This is with even more IP addresses in the file (49602 for the 1.9.3 run and 49594 for the 2.2.2 run), because the more hostnames that resolve, the more addresses there are in the final product.

[Chart: run times for the refactored program on Ruby 1.9.3 and Ruby 2.2.2]

Utilize all the cores!

Now it’s time to see how that stacks up against JRuby 9000.

[Chart: run time for the refactored program on JRuby 9000]

As you can see, this is quite impressive, especially when we compare it to the original run through of the original file.

[Chart: final run compared with the original run]

Even still, I can’t help but be a little disappointed that I couldn’t get more performance out of it. Perhaps you’d like to have a go for yourself and see? I do think, though, that the reason the run time was so long on the final run using JRuby 9000 was the time it takes to resolve addresses over the wire. A lot of them would probably have timed out, and of course the run will always be subject to all the other conditions that can arise when one is doing network I/O. I’ll also include a screenshot of Java VisualVM from the last run through to demonstrate.

[Screenshot: Java VisualVM during the final JRuby run]

Conclusions

My conclusion is the easiest part of all of this. JRuby is the future of Ruby. There is absolutely no question in my mind about it. Moving to it is far less of a migration headache than you would have going from Python to IronPython. The performance benefits and the ability to go across all cores are truly what make it appealing to me. I’m sorry that, from my testing runs, I couldn’t give you a screenshot of all the cores being completely saturated. Charles Nutter does a brilliant job of that already and I highly recommend that you all watch his talk that I mentioned above. There is also a slide deck available for it, as well as his “JRuby: Over 9000” slide deck. I haven’t seen the video of that one yet (if there is one available), but I would sure like to.

JRuby has made Ruby an even more valuable tool for me. It’s the equivalent of an 18th century carpenter finally being handed an entire shop of power tools. It is opening up more and more possibilities for the type of projects I want to do with Ruby now. I sadly didn’t even get round to talking about the amazing things you can also do since you have full access to all the Java libraries and they’re easy to call from Ruby. There are other articles out there that demonstrate that quite nicely, and perhaps I will too at some point.

Now, more than ever, I’m straining to see a use for Python for greenfield projects. They had a perfect opportunity with Python 3 to make a complete paradigm shift and make Python the .NET (or dare I say it, Java) of the open source world, to the degree that C# is in the Microsoft world. Jeff Atwood wrote a brilliant article back in 2013 entitled “Why Ruby?” that laid out many eye-opening points for a lot of people, who wondered openly why he didn’t choose Python, especially since he was your typical C# and .NET Microsoftie guru, now moving over to open source and deciding to go with Ruby. However, the thing that has always stood out in my mind when I think about that article is the comments on it. It’s replete with Pythonistas (god I hate that word) going crazy about Ruby, especially attacking its performance. I’m sorry guys, but now the Rubyists can run circles around you both in coding time and run time with JRuby. Not to mention, the time it takes to get started with JRuby is very little compared to that of Python, especially on Windows, where people used to that environment want to try it out. For the Pythonistas… It’s game over.

Windows 10 RTM black screen on boot – a solution

Windows-10

Granted, this might be a limited solution depending on what hardware you have, but it will work for pretty much any ThinkPad or other laptop that has the Radeon HD 3650 card in it. It might work with others; I don’t know since I don’t have access to the hardware, but please let me know your experience.

Let’s look at this bug, which I believe is in the graphics drivers and I’ve already reported it to Microsoft. You boot up your machine and you have not only a black screen, but the display seems to be turned off. For many folks, this is frustrating as hell, and I’m sure with time the drivers will improve, but this is one of those things that really should have been picked up by QA. Here is the solution that works for me. It’s ugly, but it works.

  • Close the laptop lid and wait for it to suspend,
  • Re-open the lid to wake it, observe that the screen is still black but powered on,
  • Suspend the machine again by pressing the power button,
  • Reawaken the machine by pressing the power button.

After this, you should be presented with the login screen for Windows 10. This has worked for me every time over the last 12 hours, and yes, it’s annoying as hell. Hopefully a fix is pushed out soon. It seems that a bunch of drivers are still lacking in Windows 10. For example, I can’t get my built in SD card reader to work as Windows 10 can’t find any drivers for it.

Update: Thanks to marvinniederhaus for the following information!

It seems that on some laptops with nvidia cards, all you have to do is cycle the power button and that will bring the display on, without having to close and open the lid. I’d advise most people to try this first, as my scenario might just be an edge case, or might apply to that entire generation of AMD cards.

The problem with Python

python

For this post, I want to talk about a significant failure within the Python community: the GIL (Global Interpreter Lock). Yes, I can hear the sighs already, along with the thought inside your head, dear reader, that says, “another GIL complaining post”. This post won’t be a complaint primarily about the GIL, but rather about the failure of the Python community at large to accept and adopt significant alternative implementations to CPython or adopt another way of looking at a problem.

In my projects, when I’m starting on something and I have a slight nagging feeling that a design choice doesn’t quite feel right, it’s usually because it isn’t. In the past when I have ignored that feeling I have ended up sticking to my guns on a choice, only to finally accept a couple of hundred or thousand lines of code later that it was actually a really bad idea. Mostly this is a feeling I get when I’m working on something in C++. These days I’m more seasoned and spend more time thinking about a design before writing it up. I feel that the GIL has been something that folks inside the CPython community have had a nagging feeling about for two decades and just ignored, driving forward with the assumption that, “it’s OK, we can just work around the GIL.” Yes, there is the multiprocessing module to work around it, but it is still just that, a workaround. There have been many times when I’ve used multiprocessing, even in my Bitcoin trading applications, but it’s still just not good enough. In 2015, having the core of Python so badly, I’d say, “broken” is inexcusable.

When Python 3 was being developed (who remembers it being called Python3000/Py3k?), it was already signalled that it was going to be a breaking change from the 2.x branch. We already knew that a whole bunch of stuff was going to be broken and that a lot of effort was going to be required to move a lot of modules over, so why was the CPython community not courageous enough to sit down and future-proof Python for the many-core reality that we live in? This seemed like the perfect opportunity to do so, but instead they missed the bus. As a Python developer for many years, I was hugely disappointed by this, as were many others. The only real change that has happened to the GIL in the last 20 years has been a tweak in Python 3.2 that, depending on your workload and OS, gives varied results. In essence, it’s still a single track of execution. CPython’s implementation, I believe, is holding Python back from adoption in greenfield projects where it is wanted and highly desired, and where it would be used if it wasn’t for the GIL.

Python without the GIL?

It’s been done before. Jython and IronPython both have GIL-less implementations of Python running native threads on the JVM and CLR respectively. Going forward, I believe IronPython is our Obi-Wan Kenobi, our only hope in a many-core world. Why shouldn’t it be? Microsoft’s .NET is now open sourced under the MIT/Apache licenses and with it comes the CoreCLR/DLR/Roslyn infrastructure. You can build the CLR on Linux right now if you want to. For all of you with massive codebases of 2.7 code, why not have a little research group inside your company that explores deploying your code on the CLR rather than upgrading to Python 3.4+? You’ll end up getting a whole lot more bang for your buck if you do. Now granted, not all of your CPython modules are going to be available to you in IronPython, but work has been done, such as at the IronClad project, to make your Python C extension modules work. I believe for the 2.6 branch they had NumPy working and might have again for 2.7; you’ll need to research that for yourself. If the Python community actually started to band together around a better Python implementation than CPython, it would be better set up for the future. If you want to stay with the GIL for the next 15 years before something finally gets done about it in CPython and another massive breaking change is made, then good luck to you. You have that choice and that is the beautiful thing about open source software, choice.

Why do I care?

Personally, I completely adore Python. I love its syntax; it’s a true pleasure to sit down and write code in and actually get some work done. I care because I use it and I want to create great projects with the language. I don’t, however, like that I’m constrained by a design decision that is now decades old. When something doesn’t work in open source software, change it, fork it or reimplement it and do it right. Yes, it’s brutal and can take years, but you have to ask yourself the question: will it be worth it for my users, my community, my code and my company ten years down the road? I believe the answer will be entirely obvious to everyone. The GIL must die.

This is a problem that also faces Ruby and sadly, I think it’s one of the things that is now starting to hold it back from a lot of applications. Yes, Ruby does really well in some fields and I believe its largest project is still Metasploit, but when I talk to a few friends who write Ruby as their day job, they’re starting to feel really annoyed by one little aspect. The GIL.

Take EVE Online for example. They’re a mostly Pythonic shop and make extensive use of Stackless Python, and as a long time EVE player who has friends working for CCP Games, I’ve heard the gripes about Python. As a player, I’ve also experienced first hand, on too many occasions, where Python has simply not been up to the task. Hats off to the guys at CCP though; the fact they have had to work around Python performance for 12 years in a live production environment for an MMORPG is a testament to their skill and determination. Improving Python 2, or creating the incentive to upgrade to a GIL-less Python 3, might have actually been worth it for these guys. Whilst on the subject of CCP Games, they’ve released Brennivin (named after the infamous Icelandic booze that leaves you with an epic bastard of a hangover) that contains a “non-crap” threading class. Take a look at brennivin.threadutils. Thanks CCP!

I have at least ten different projects in my mind, including one that I’m working on right now in my own time, where I would love to use Python, but I just can’t. So instead, I’m happily punching out thousands of lines of C++11 code, but I still have that odd sigh where I say to myself, “I wish I could use Python, but I can’t.” When I see a CoreCLR and IronPython package in Debian or Fedora though… that’ll change. Perhaps I should heed my own advice and actually contribute to IronPython and package it for Debian and Fedora. I’d like to. I’d also like to have the time and have it pay my bills but it isn’t like that’s going to happen any time soon.

Keyboard obsession: How it shaped me

blackboard1024

Keyboards are a wonderful thing. For most of us now in the second decade of the new millennium, they are an interface to the rest of the world. They are how we express ourselves, make a living, start relationships, even end them if you’re of the heartless type. Programmers and writers tend to share the same fetish or obsession with our literary tools. Tom Hanks famously has a typewriter that he adores and tries to use every single day. American writer Ron Rosenbaum is famous for his devotion to his Olympia Report De Luxe typewriter, which he has described as a fetish. Ron and his typewriter were even the inspiration for a character in a film starring the aforementioned Tom Hanks named “You’ve Got Mail.” Many people would debate whether programmers are artists, but I certainly know that there are people out there who write code and structure programs in a way that makes it hard not to appreciate the complex beauty. I myself have even been known to bang out a masterpiece or two from time to time, but usually I attribute that to alcohol and the Bob Dylan discography. Keyboards are much more to us than just the devices you use to “type stuff” with. Like any artists with their tools, we bond with them, get jealous and defensive of them and form extremely strong opinions on which is best.

Back in January 2014, I managed to secure from a friend something that I’ve wanted to get my hands on for a while: a black IBM Model M keyboard with the trackpoint nipple and mouse buttons below the space bar. You probably think that I’m crazy and have lost my marbles, but just hold on there. It’s universally accepted amongst anyone who has any taste in computer keyboards that the absolute best ones ever made were the IBM Model Ms of the 1980s and 1990s. These are such desirable keyboards that they sell online for hundreds of dollars. I found out through a fellow collector that a very rare variant once sold for $800, still sealed in its original box. You might think I’m talking more about action figures than computer keyboards, but many people including myself lust over and value these keyboards. It isn’t just this particular type of keyboard however. There are many Model M variants and clones that are equally good, such as the Dell AT101. The AT101 is the one that I used to use mostly, including when I had my iMac, since I had converted it from PS/2 to USB. My old classic Model M is only pulled out for special occasions and most certainly only where there isn’t any food or drink nearby. Most of the time it is kept within the dust wrapper that it came with when I bought it off eBay in 2007. This surely is the sign of a time sadly long gone when people really cared about their computer hardware.

As I originally wrote this article, I was using the undisputed best keyboard for a laptop computer: that of the IBM ThinkPad T60p, the divine god of laptop keyboards. It is a keyboard that has the ability to suck in the most devout of Apple ‘chiclet’ keyboard hipster fan folk and turn them into true believers. For years and years many people bought IBM ThinkPads purely for their keyboard and robust construction. Sadly, when IBM in their infinite moronic mental state sold off the consumer business to Lenovo, it didn’t take long for the legendary ThinkPad keyboard to be replaced with the fad of the month, the MacBook Pro type chiclet keyboard. Last year, I was worried that my keyboard would break sometime in the future so I decided to buy two extra ThinkPad keyboards. Sadly though, due to my T60p no longer having a functioning LCD (which will be fixed in due course), I’m deprived of using what I still believe to be the perfect balance of notebook computing, thanks in large part to its keyboard being married with that amazing 1600×1200 display in a 4:3 aspect ratio.

IBM T60p Keyboard

All these great keyboards have the same elements that make them the top of the line. Their construction is robust and long lasting, and they are designed to be a pleasure to write on. You have to remember that back when the Model M was being produced, computers were selling for $5000 and above in 1986 dollars, certainly for workstation grade machines. Let’s compare that to the keyboards that come with computers today. Those $5000 in 1986 dollars would be $10,632 today due to inflation, a 112.7% rise. I was born in 1986 and during my lifetime US dollars have lost half their value; what a horrendous world we live in. Could you truly imagine buying a computer for $10,000 today? No, most people spend around $699-$1100 for a good desktop computer. They normally come with a cheap, mass manufactured piece of junk from China that costs $3 to make, but is probably passed onto you for $20-$25. This is why the classics are the best, and forever will be. Thankfully there are companies that make great keyboards once again, but they’re not cheap. One such is Das Keyboard, which is a fairly good keyboard, but the original and best is back thanks to Unicomp. My good friend Sean has one of these keyboards and they’re true to form and a real pleasure to use. Thankfully, they’re not in the hundreds of dollars range, but they’re just as good. They even make a version for Mac OS X. Many other contenders have been entering the mechanical keyboard market again since there is a real demand and appreciation for them. In particular, you have companies such as Corsair and Razer, known for their high end equipment that gamers and enthusiasts love, now making truly spectacular mechanical keyboards and adding their own flair to them with key lighting available in over 16 million colours. The Corsair keyboard is the one that I have my sights on next, as soon as my budget allows it. Offering three variants of Cherry keys on their keyboard, each with its own tactile feel and targeted at gamers, means that these folks are not messing around.

How I came to love keyboards

My love of keyboards and typing in general started at a stupidly early age. When I had sleepovers with my cousin as a young child I used to play on a little typewriter that she had. I was fascinated by how the mechanics of it worked, how you could hit a little button and the colour on the page would change. The first time I managed to change the ribbon on my own was a good feeling. One Christmas, my mother got me a Brother electronic typewriter which, as I found out whilst writing this article, they still make. I had wanted one so badly so I could keep typing. I used it for that at first, but not all that long after getting it I had torn it apart to see how it worked and then put it back together again successfully. After that, I inherited a computer from my cousin, which for me was genesis, the moment my mind really opened up. It was an Atari ST520E and I absolutely adored it. I think this is when I first fell in love with that clicky clanky sound of the keys. They were mechanical, but not as obnoxious as the AT101 can sometimes be. It was a smooth keyboard, and even for the small hands of a boy of nine years old it wasn’t so big that I couldn’t type fast on it. It was perfect.

Not that long after I got the computer, I acquired some extra software for it at a local market. Ataris and Amigas were pretty popular in the UK all the way up into the mid-to-late nineties and software was readily available and usually never original. Some of the “wares” that I got, or “warez” as it’s now known, included FirstWord Plus, Deluxe Paint and a couple of floppy disks entitled PureC, GFA Basic, DevPac and NEOCOM.

DevPac on the Atari ST


Imagine the scene of a little kid, sitting down with a bunch of assorted floppy disks, wrapped in a rubber band, that he got at a market with those kinds of labels on them. They sounded so cool and I couldn’t wait to play them! Sadly, they weren’t cool games with cool names. Rather, they were the beginning of something truly awesome.

First up was DevPac, and I was completely startled and disappointed that it wasn’t a game. I inserted the other floppies labelled DevPac and surprise! Someone had written a whole series of tutorials on assembly programming. Imagine discovering that when you’re very young and becoming enthralled by it. The thing that really hooked me was the fantastic ASCII drawings within the file, and on the first read through that is what kept me reading. I wanted to understand the amazing drawings. After doing my very first “all nighter” in front of a keyboard, an event which would eventually become the norm in my life, I had successfully written my very first, second, third and even sixth programs. I couldn’t sleep that morning, I was so excited. My mother wondered why I was tired and wanted to sleep late, so I told her I wasn’t feeling well and just wanted to sleep in. Thank goodness it wasn’t a school night! It was as if I had discovered something great about the universe that I never thought possible. There is a feeling that people get when they try to learn something new: it seems impossible, as if you’re climbing the steepest mountain and are exhausted. Eventually something happens where it literally just ‘clicks’ in your mind. As soon as that happens you get a rush of adrenaline, you’re excited that you’ve finally figured out some great secret that no-one else knows, and you start to develop an ego. Fair enough, most people might not with the latter, but I did when I was a kid and I never failed to show off my computer skills at school, something that would later come back to bite me. Thankfully all the floppy disks that contained editors and compilers came with tutorials or phone numbers for a BBS where you could learn more. I did just fine with the text file tutorials, and with them I was able to write my very first C programs as well as my first BASIC programs.

BASIC was the first language I used where I made an interactive program on my own. It was very simple and fun. You would start it up, it would ask for your name, age and gender and then proceed to throw various insults at you. You could retort with your own insults or “punch the computer”, which shut it up. I didn’t realise it at the time, but with that program I had just made my first game. I continued down this path, but even very early on BASIC annoyed me. I wanted to try and write it in assembly, but then I opened up the C development environment, fired through the tutorials on those disks and instantly fell in love. Sadly I can’t remember who wrote those tutorials back in the day. I would love to hear from them or even get another copy of them so I can give this individual the credit they truly deserve. I’m sure I’m not the only person who was turned into a programmer by those text files. To this day, I still haven’t seen C programming learning material that can beat what that guy wrote. Eventually I rewrote my program in C and then moved on from there, onto bigger and better things. My love for typewriters and keyboards has in some way, whether small or large, contributed to me being a software engineer and knowing the things I know. I’m grateful for it as I couldn’t imagine my life without having this knowledge. I’d feel empty without it and I feel sorry for those who can’t write code. They truly don’t know what they’re missing, including perfect keyboards.

In all this time I have sampled hundreds if not thousands of different types and variants of keyboards. I will always prefer hardware mechanical keyboards for my desktop, and my ThinkPad T60p for my laptop. The only software keyboards I have ever liked are the one on the iPad 3, only because it was big enough, and the BlackBerry 10 keyboard. Now we live in a world where we use small slabs of glass to write on and transmit all over the planet. The best mobile phone software keyboard I have ever used is hands down that of the 2013 BlackBerry Z10. The keyboard sold me on that touch screen phone when I was previously a BlackBerry hardware keyboard junkie. I love that software keyboard as much as I love my T60p’s keyboard, my Atari or my Model M. There are some things in this world that are just sheer perfection and thankfully, I hoard those things. If you have any similar stories, please share them with me; I’d love to hear them and publish them here on the site with your permission.

This article was originally posted on an older form of my blog back in January 2014. It has since been updated slightly.

Parse ip address and hostname from a text / log file

retro

A good while ago, I needed a quick and easy script for parsing out IP addresses and hostnames from a log file. Specifically, I was doing research into a popular IRC channel and wanted to see what the nationality makeup of the channel was. I logged the channel for seven days using weechat and banged out this quick and dirty little Ruby script to parse the log file and write the IP addresses into a new text file. Since I created it, I’ve been using it a lot on many other logs and it has come in very handy for doing some data analysis. Since it adheres to the Unix philosophy of doing one specific job, I thought I would share it with the world on GitHub from my own Jira server. I haven’t bothered with the rest of the system this was developed with because it’s very application specific, so feel free to use it as you’d like. I’ve licensed it under the MIT license since no doubt someone will endeavour to make it more efficient, and I encourage you to. Performance wasn’t one of my design goals when creating it though. Have fun!
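For anyone who just wants the gist of the technique without grabbing the script, here is a small sketch of the idea. To be clear, this is not log-ipsnarf.rb itself. The method names, the two regular expressions and the sample log lines are assumptions made up for this illustration, and it only handles IPv4.

```ruby
require 'set'

# Rough patterns: four dot-separated number groups, and dotted names
# that end in an alphabetic top-level label.
IPV4_CANDIDATE = /\b(?:\d{1,3}\.){3}\d{1,3}\b/
HOSTNAME_CANDIDATE = /\b(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}\b/i

# The IPv4 pattern alone would accept 999.1.1.1, so check each octet's range.
def valid_ipv4?(text)
  text.split('.').all? { |octet| octet.to_i.between?(0, 255) }
end

# Collect the unique IPv4 addresses and hostnames seen in the given lines.
def extract_hosts(lines)
  ips = Set.new
  hostnames = Set.new
  lines.each do |line|
    line.scan(IPV4_CANDIDATE) { |match| ips << match if valid_ipv4?(match) }
    line.scan(HOSTNAME_CANDIDATE) { |match| hostnames << match.downcase }
  end
  { ips: ips.to_a, hostnames: hostnames.to_a }
end

# Made-up lines in the rough shape of an IRC client log.
sample = [
  "12:00:01 --> alice (~alice@host-1-2-3-4.example.net) has joined #channel",
  "12:00:09 <-- bob (~bob@203.0.113.7) has quit (Ping timeout)"
]
p extract_hosts(sample)
```

In real use you would feed it something like `File.foreach("channel.log")` rather than an array. The hostname pattern is loose on purpose, so anything that looks like a dotted name (a file name in a message, say) will be picked up too and may need filtering afterwards.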

Github: log-ipsnarf.rb

Ruby + Sequel ORM = Awesome

Ruby

For the last few weeks I’ve been developing a research project to increase my understanding of Ruby after hardly touching it for a little while. The purpose of the research isn’t just to increase my understanding of and familiarity with Ruby, as I’ve already worked on a few projects using it, but refreshing some skills is a nice by-product of the work. You could consider part of this project data mining, and for that I’m storing the data in a PostgreSQL database. Working with ActiveRecord wasn’t something that I wanted to even consider since I pretty much hate it after I had so many problems with it for a client a while back. I want more fine grained control over my data without actually going back to the days of old and writing raw SQL again, and for that Sequel is perfect. I can create my models in whatever way I want and I can interact with the database in a very powerful way that is a pleasure to work with in Ruby.

However, the Sequel documentation, whilst great for most things, does have a few shortcomings. I have to admit, some of this is down to my own failings in forgetting a few things, something fixed by a quick refresher from Google. However, I don’t think there is any excuse for their examples in the rdoc to be wrong.

Consider the following from the Sequel “opening_databases.rdoc” file.

If you then tried to use the DB object as is stated in their documentation by creating a table such as the following example, you would receive a “formal argument cannot be a constant” error when you try to run it.
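The snippets referred to in this post did not survive here, so as a general illustration of where Ruby produces that error, the sketch below shows the language rule on its own, with no Sequel involved. The helper `syntax_error_for` is made up for this example. Ruby refuses any block or method parameter whose name starts with a capital letter, because that makes it a constant. Whether that is exactly how the documentation example tripped it is an assumption on my part.

```ruby
# Try to parse and evaluate a snippet; return the parser's message if it is
# rejected, or nil if it is accepted. SyntaxError is not a StandardError,
# so it has to be rescued by name.
def syntax_error_for(source)
  eval(source)
  nil
rescue SyntaxError => e
  e.message
end

# A block parameter named like a constant is rejected at parse time.
rejected = 'proc { |DB| DB }'
# The same block with a lowercase local variable name is fine.
accepted = 'proc { |db| db }'

puts syntax_error_for(rejected)
puts syntax_error_for(accepted).inspect
```

The usual way around it is to keep `DB` as a constant assigned at the top level and use a lowercase name such as `db` wherever a parameter is needed.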

I managed to get things figured out pretty quickly though, as you can see from the small excerpt below from my database creation script. I’ve removed a lot of stuff from it that isn’t required for this example.

I’m sure at some point I will make more posts about Sequel as I am truly enjoying using it for this project, and Ruby as a whole for that matter. Perhaps one day I’ll even port over some of my old Perl, Bash and Python scripts to Ruby! Right tool for the right job in my case for now.

OpenBazaar constructive critique

OpenBazaar

OpenBazaar is a P2P, anonymous encrypted trading system. You pay in Bitcoin, you receive goods. This is awesome and it is the future. This is what is going to make marketplaces like eBay and Amazon nervous within ten years. This is a gloriously disruptive technology that I cannot wait to hit the mainstream. This is the Napster/Kazaa of eCommerce. It’s also likely to be one of the largest driving forces of Bitcoin sometime in the future.

It does, however, have problems. Whilst yes, it is still in the early stages and still in beta, I can’t help but recoil in disgust at its implementation. First let’s go over the installer. I tested it on my Fedora 21 workstation and the instructions on the Github page said to run ./configure.sh. Sure, no problem, it’ll just be a standard ./configure;make;make install deal, I thought. Hooo diggity, was I wrong. It first asks me for my sudo password and then starts installing packages through yum without asking for my confirmation. That’s a big no for me and left a sour taste in my mouth at the start. You don’t just go downloading and installing packages to someone’s machine without first asking them for confirmation or at least telling them why. Secondly, the entire system is implemented as a Tornado webserver (Python) which then opens up your local browser (after creating the PGP keypairs) at 127.0.0.1:<random port>. WHAT?! Ok, so let me get this straight. This fancy new system that’s all about security, privacy, being anonymous and independent from any central power decides the best way to implement this is in a bloody local web server? This is a bad idea for many reasons.

First we have to consider, plain and simple, the security implications here. When you’re doing all of this stuff through a web browser, presumably Chrome or Firefox, then you are opening up an entire realm of security problems that you needn’t have. Browser exploits are ugly and could potentially de-anonymize your users or steal their credentials, especially when you show the Bitcoin private key you generate in plain text in the settings page. Then of course there are the possibilities that one has to consider when it comes to malware, criminal organisations or other advanced persistent threats such as nation states. A little piece of javascript malware injected into this would be comedy gold on Reddit.

Yes, this can be used through Tor, but you shouldn’t be forced to set up Onioncat or Proxychains, nor does it protect you from vulnerabilities on your own system. Especially when Windows users start using this.

I’ve only spent 10 minutes playing with it thus far, but the fact that you guys have not had a shred of security auditing or the eyes of a pentester on this at all is blindingly obvious – and dangerous. Yes, I understand, it’s still early days but I think you guys have gone about it completely the wrong way from the start. It’s a good reference implementation but how are people supposed to use this on their mobile devices? I hope that other clients are made for the network as this one is horrible. If you wanted to bootstrap the network, it should have been with a native client and completely solid code for the reference implementation. If BitTorrent and Bitcoin had been distributed like this, I don’t think they’d be at the point they are today. Food for thought…