  • Wednesday Wave – Issue #1

    Posted on September 25th, 2014

    ISSUE #1 – 24th September 2014
    “All the News that’s Fit to Link”

    Welcome to the inaugural edition of the Wednesday Wave, Boundary’s weekly newsletter on all that is 1-second monitoring. The ‘Wave’ alludes to the sine wave in our much admired logo.

    Latest Product
    Announcing
    • Boundary’s Office Hours each Wednesday. Live 30 minute tech training/troubleshooting workshop. Join at your convenience.
    Events
    • @Stripefour: “New kid @boundary could well give @newrelic a run for their money – pricing is superb guys!!!! free.boundary.com check it out :-)”
    • @davidleefox: “#monitoring literally cannot get easier. I just clicked a promoted @boundary tweet, signed up, ran a script, and POOF, monitoring…”

     

    Have a great week!
    The Boundary Team

  • 6 ways to lose a customer before they are a customer (and 1 way not to)

    Posted on September 23rd, 2014

    Over the last couple of months at Boundary, we’ve been evaluating several different SaaS solutions to help us run our business and serve our customers better. As part of this process, I’m always interested to see how other companies manage the evaluation/buying process – what can we learn in terms of best/worst practices?

    Today’s world is very much about instant gratification. The freemium model or instant-on trials have become the norm for many areas and, as such, users’ (in this case, our) expectations have changed.

    Here are some things we didn’t like:

      The Pre-Qual Rep

    I hate speaking to a “pre-qualification” sales rep. I’m fine speaking to someone in sales; they’ve got a job to do, and at some point they want us to buy something. But, as was the case with one company, please don’t make me speak to someone who is clearly just reading through a list of questions to “qualify” me. I don’t mind being qualified – but make it a conversation. Give me some value back for the information I am giving you.

      The “speak to our technical people multiple times before we will show you our product”

    One company had our technical director on the phone for 3 separate calls and they still hadn’t shown him the product or let us try it. We terminated the evaluation there.

      Death by PowerPoint

    An oldie but a goodie and still, remarkably, used by multiple companies. Show us the product. If I cannot try it myself, then at least show it to me live.

      The free trial where you’ve got to enter your credit card first

    This one just happened yesterday. We had spoken to the company twice, explained our needs, and they told us about the free 14-day trial and enabled us. But, step 1…enter a credit card. Sort of like the waiter insisting on your credit card before they allow you to order dinner.

    The response from the company was:

    We’ve made a conscious decision because of the work that we put in to a trial, to ask clients to have a valid credit card on file. Please let me know if you change your mind and we’ll be happy to get you into our trial process.

    Really? Why? Because you’re worried I will run off after eating your food and not paying? Or are you hoping that we will forget to cancel the credit card?

      The overly complicated signup form

    Yeah, love this one. You make me enter so much information on the signup form that I’m asleep by the time I’m done, but then, when I actually get into the product itself, you make me enter pretty much all the same information again.

      The bait and switch

    This one is really disappointing. This very well-known, large and leading provider of a SaaS solution presents multiple “editions” on their web site. Below each edition is the description and the button “Try for Free”. I press the button for the Unlimited Edition, and the next screen states “xxxx – Unlimited Edition complete the form and enter the details.” Once I do that, I receive an email to go into the product, only to find out that no matter which edition I had selected on the web, they only give me the Professional Edition for the trial.
    This is the worst experience for me – bait and switch – and I tried it multiple times because I couldn’t quite believe what was happening.

    And now, to finish this post with a positive experience:

      The “even though we’ve got problems, we’re going to show you how much we want your business”

    One vendor we were dealing with was having a number of issues with their product during the trial. But they didn’t run away; they stuck with it and gradually solved their issues. Their CEO reached out and told us how important we were and that they were working hard to solve the issues. Because of the manner in which they reacted, we gave them time and…eventually we purchased. They cared about our success and, because of this, we’ve become a customer that is giving positive references to others.


  • Host Selection Filters and more…

    Posted on September 19th, 2014

    Host Selection Filters help manage larger numbers of hosts

    While we work on many major new capabilities that are coming very soon, a number of smaller enhancements and capabilities made their way into the product last week. This blog post talks about the new Filter control, the new Alarm button and new Knowledge Base videos and articles.


    One of the most popular requests that we are getting from our customers is for many more ways to manage large numbers of servers and instances. Last week we pushed a new feature into the UI that allows users to quickly filter the hosts to be displayed. For example, I would like to quickly remove all my web servers (conveniently named ‘web…’) from my “default” dashboard. So click on the search button and start typing – the list is filtered as you type – just as you would expect!


    Then click on the corresponding check-boxes to unselect those servers. Keyboard fans can use the “arrow down”, “arrow up” and “space” keys to quickly move, unselect or select servers. Note, there are no “OK” or “Apply” buttons – the dashboard updates itself as you go.


    Clicking on the search icon again or hitting the ‘Esc’ key will close the filter. As you can see, the dashboard’s charts and Legend bar no longer show any of the filtered web servers.


    Look out for more and more great enhancements over the next few weeks.


    Many of our users told us that Alarms are really important to them. To make them more accessible, we added a toolbar shortcut which takes you directly to the Alarm Settings screen.


    And one more thing…

    We have been quietly yet actively adding new content to the Knowledge Base, which is shared by the Boundary Free and Boundary Premium offerings. It now has 12 recently added video tutorials, many new “How To” articles and also some advanced stuff, like plugins and SDKs for code instrumentation. All indexed and easily searchable. As one of our users said, “do not let slick UI fool you – there is a lot of powerful features hidden just a click away in this cool offering”.


    We continue to welcome as much feedback as we can get. This helps us develop the product totally in line with our customer needs.


  • The role of code in customer success measurement and improvement

    Posted on September 18th, 2014

    [Image: Acorn Atom]

    So I finally did it. After living in the United States for 17 years, having an American wife and 3 American kids, today I became a US Citizen. The process was not too bad (unlike replacing my green card) and now I have to get up to San Francisco tomorrow to get a US passport (they cancel my green card when I get citizenship, so right now I cannot travel out of the country which is a weird feeling).

    But, continuing on the customer success theme from previous posts, and referencing Cliff’s post from earlier this week http://boundary.com/blog/2014/09/16/age-of-code, it is so true that all areas of the business have become dependent on the ability to write code (I also want to thank Cliff for writing a post that I can actually understand for a change).

    It’s not just marketing that needs the coding capability; it is also critical for the customer success team, the product management team and yes, even the sales team.

    It’s about understanding what your customers are doing so that you can help them more, understanding how they are using your product so that you can make the user experience better and deliver more value, and understanding which customers may want some assistance and which prefer to be left alone. Without the instrumentation and the tools, customer success becomes a guessing game. Honestly, the advantages that SaaS companies have over providers of on-premise tools just make it very unfair.

    Two of my kids are freshmen in high school this year, and it’s somewhat surprising to me that they don’t have a programming class. I would maybe consider it as important as learning a second language for their generation, but even though my son is an avid gamer, he has no interest in learning to code at all (my daughter does, however).

    For me, I have not written code for a very long time. I started when I was 14, saved my money from my paper-round (approx. 4 pounds a week) to buy an Acorn Atom (anyone remember that?), taught myself BASIC, then after leaving school I worked for a couple of financial institutions in the UK programming in SAS, COBOL, JCL and even did a tiny bit of assembler. But I never normally tell anyone that, because then they’d expect me to still be able to, and I don’t want to go there.

    By the way, the specs for the Atom (thanks Wikipedia) were:

    • CPU: MOS Technology 6502
    • Speed: 1 MHz
    • RAM: 2 kB, expandable to 12 kB
    • ROM: 8 kB, expandable to 12 kB with various Acorn and 3rd party ROMs
    • Sound: 1 channel, integral loudspeaker
    • Size: 381×241×64 mm
    • Storage: Kansas City standard audio cassette interface


  • The Power of One-Liners…Metrics!

    Posted on September 17th, 2014

    A developer, a designer, and a data scientist walk into a bar…and then Facebook buys the bar for 3 Billion Dollars!

    I have always been a huge fan of comedians who have the power of one-liners. My earliest exposure to a comedian who performed in this genre was Henny Youngman with his infamous wife jokes: “Take my wife … please.” The recently departed comedian and actor Robin Williams was also a master of the one-liner with: “Reality…what a concept” and “If it’s the Psychic Network why do they need a phone number?”, to quote just a few memorable quips.

    Similar to the comedian’s ability to incite laughter with a single line, a devops engineer is equally adept at issuing a one-liner in Bash, or a similar shell, to query a process or service and collect metrics indicative of its health. Boundary Premium and the Shell plugin can amplify the power of the one-liner metric command by providing a graphical plot of the metric over time.

     

    How does it Work?

    Boundary’s Shell plugin is a generic plugin that allows the use of any program or scripting language to produce metrics for the Boundary Premium product. The plugin relay expects a script or program to send metrics via standard output with the given format:

    <METRIC_NAME> <METRIC_VALUE> <METRIC_SOURCE>\n

    where:

    METRIC_NAME is a previously defined metric
    METRIC_VALUE is the current value of the metric
    METRIC_SOURCE is the source of the metric

    Here is a one-liner example which outputs the current number of running processes:

    $ echo "BOUNDARY_PROCESS_COUNT $(ps -e | egrep '^ *[0-9]+' | wc -l | tr -d ' ') $(hostname)"

    which yields this output:

    BOUNDARY_PROCESS_COUNT 205 boundary-plugin-shell-demo

    We can take this one-liner and then configure the Shell plugin to periodically report and display this metric:

    [Screenshot: Shell plugin configuration form]

    In the configuration form above, we have defined how our metric will be collected by setting the command field to the one-liner, passed as an argument to the bash shell using the -c option:

    bash -c "echo BOUNDARY_PROCESS_COUNT $(ps -e | egrep '^ *[0-9]+' | wc -l | tr -d ' ') $(hostname)"

    The Poll Time field is set to 5 so that the metric command is run every 5 seconds to provide an update of our metric.

    We can now display our new metric in a dashboard as shown here:

    [Screenshot: dashboard displaying the new metric]
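
    Because the plugin simply reads standard output, the metric source does not have to be a shell one-liner at all. As an illustration only (the class name and the BOUNDARY_JVM_THREAD_COUNT metric below are hypothetical – you would first define your own metric, as noted above), here is a small Java program that emits a value in the same <METRIC_NAME> <METRIC_VALUE> <METRIC_SOURCE> format:

    import java.lang.management.ManagementFactory;
    import java.net.InetAddress;

    // Hypothetical example: prints one measurement line in the Shell plugin format.
    public class JvmThreadCountMetric {
        public static void main(String[] args) throws Exception {
            // Current thread count of this JVM, used as the metric value
            int value = ManagementFactory.getThreadMXBean().getThreadCount();
            // Hostname of the machine, used as the metric source
            String source = InetAddress.getLocalHost().getHostName();
            System.out.println("BOUNDARY_JVM_THREAD_COUNT " + value + " " + source);
        }
    }

    The command field would then point at whatever launches the program, in the same way the bash -c example above launches the one-liner.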

    Exit Stage Left

    If you want to know more about the Shell plugin and view more examples of its use, see the documentation located here.

    Okay, let’s end on a one-liner:

    Two MySQL DBAs walk into a NoSQL bar, but they had to leave because they couldn’t find any tables!

    I’m outta here, good night!


  • The Age of Code

    Posted on September 16th, 2014

    I’m a big fan of science fiction, particularly the type of science fiction that holds up a mirror to what we’re experiencing in our own world. One of my favorite authors is Vernor Vinge, a computer scientist and math professor. As I live in a world purportedly being eaten by software, I find his vision of the future compelling. In particular, I’m struck by the peculiarities of human culture hinted at in “A Deepness In The Sky”. In this book there exists a subluminal trading society known as the Qeng Ho. The Qeng Ho rely on computer automation for the vast majority of work they perform, but their computing capacity is a far cry from the sentient AIs of so many other stories. Instead, almost everyone on a ship and in their society is a programmer of some specialization. There are “programmer archaeologists” who specialize in plumbing the depths of the Qeng Ho’s substantial archive, which is rumored to have roots all the way back to the Unix utilities we use today. Pham Nuwen, one of the main characters, is described as a “Programmer-at-Arms”. The Programmer-at-Arms is responsible for automating the coordination of ship weapons systems for maximum effectiveness, going so far as to tune targeting priorities and other system parameters as the battle progresses. In “A Deepness In The Sky” Pham’s mastery of software automation proves to be his saving grace through much of the novel. His foes, the Emergents, use a technique called Focus to turn people into almost perfect organizational cogs to be assembled into a machine. Doing so comes at a great physical and mental cost to the Focused. Not to give away any spoilers, but the Emergents’ reliance on hierarchies of control and haphazard adoption of poorly understood tech eventually leads to their downfall at the hands of Pham.

    We’re seeing that same theme play out in our own time. The DevOps movement, ill-defined though it may be, can be characterized by a central theme: automation allows individuals to be responsible for delivery of business value in a holistic way. The removal of people removes hierarchy, organizational rigidity and political misalignment between individuals and groups. The gains in efficiency don’t just come from removing people from the payroll; they come from providing greater mastery to the individuals shipping code. The traditional role of sysadmin has morphed into the DevOps engineer, whose job is to provide automation support to the rest of engineering instead of operational handoff. DevOps is just part of the story, however. We’re seeing the early stages of this transformation reaching into other areas of business, and as the benefits become apparent it will only accelerate.

    The role of a growth hacker, although snickered at in some engineering quarters, reflects the new reality for the marketing arm of business: marketing is a cross-cutting concern that requires engineering in order to be effective. Everything from putting tracking codes in web pages to building out sophisticated predictive analytics requires engineering effort. However, product engineering is run at 100% of capacity in virtually all organizations, and getting pulled off large product engineering projects to work on interruptive priorities for a disparate arm of the business is massively disruptive to productivity. Hence the role of growth hacker, someone who has enough command of programming fundamentals to automate their job without imposing new requirements on product engineering.

    In Vinge’s novels the characters don’t generally talk about software or code. For the most part they refer to those things as automation. I think that is very much intentional, to portray a culture that has completely internalized the power of computers and software. In our world, however, that internalization is still very much getting underway. So if you wanted a reason to learn to code, don’t just do it because IT jobs pay relatively well. You should do it because coding will soon be a basic literacy requirement for any form of knowledge work and many physical jobs as well. The power of automation is that its benefits compound over time, and a compounding competitive advantage cannot be ignored for long. As for me, well, I’m going to work on my archaeology skills.


  • More RabbitMQ than Sainsburys

    Posted on September 11th, 2014

    There is an old slang expression that I first heard while growing up in South London: “more rabbit than Sainsburys”. It was used in reference to a person and was a way of saying that they talked too much, almost incessantly, and with way more quantity than quality. For those of you not familiar with the expression, it might help to know that Sainsbury was and still is a food retailer in the UK and that when I was a child, a lot of people ate rabbit; in fact, rabbit was one of the cheaper cuts of meat and far more available than chicken, for example. The rabbit in the expression refers to “talk”, based on the rhyming slang “rabbit and pork” – talk.

    This week I was seriously impressed with a rabbit, but not of the furry kind. It was in fact a software package called RabbitMQ that I am sure many of you know already. The reason that I was so impressed was that, despite my questionable competence in Java that I mentioned a couple of weeks ago in another blog, I was able to create a working RabbitMQ load simulator in just a couple of days, starting from a basis of no understanding at all of RabbitMQ and not even knowing where to find the download. Perhaps in my subconscious I still had some primal knowledge from my brief exposure to IBM MQ Series back in the 1990s, and that in some way helped me get to success quicker, but I was amazed at how easy it was to install, use and understand. The documentation was accurate and even entertaining at times. I even enabled the management plug-in for RabbitMQ to watch my application in action.

    [Screenshot: RabbitMQ management console]

    From the above picture, you will also see that I enjoy listening to Ultravox and run Windows, two clear signs that I am definitely not a professional developer.

    The reason that I was doing this RabbitMQ exercise was twofold:

    1. To gain an understanding of RabbitMQ as part of our plans to expand the set of plug-ins that we will be providing in Boundary Premium. Right now, RabbitMQ is in the top 5, possibly the top 3.
    2. To simulate a Java workload that used a messaging service for data ingress and that would use many threads to process those messages. I wanted to create a simulation environment to replicate “Software Bottlenecks”. I was particularly pleased with the way that the application increases the message rate gradually to a peak at the mid-point of the test and then declines after the mid-point. See the smooth curve in the top left of the picture above.

    double sleepTime = distanceFromMiddle/(messages/2)*define.INTERVAL + define.INTERVALBASE;
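
    For readers curious how that single line shapes the load, here is a minimal, self-contained sketch of the idea. This is not the actual simulator – the constant values, class and method names are mine – it simply shows how a per-message sleep derived from the distance to the mid-point produces a rate that ramps up to a peak and back down:

    // Illustrative sketch only: the sleep between messages shrinks towards the
    // mid-point of the run and grows again afterwards, so the message rate
    // peaks in the middle, matching the smooth curve described above.
    public class RampedLoadSketch {
        static final double INTERVAL = 50.0;       // extra sleep (ms) at the start and end
        static final double INTERVAL_BASE = 5.0;   // minimum sleep (ms) at the peak

        public static void main(String[] args) throws InterruptedException {
            int messages = 1000;
            for (int i = 0; i < messages; i++) {
                double distanceFromMiddle = Math.abs(messages / 2.0 - i);
                double sleepTime = distanceFromMiddle / (messages / 2.0) * INTERVAL + INTERVAL_BASE;
                publishOne(i);                     // stand-in for the real publish call
                Thread.sleep((long) sleepTime);
            }
        }

        static void publishOne(int n) {
            System.out.println("published message " + n);
        }
    }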

    Anyway, my exposure to RabbitMQ was fulfilling and enjoyable. It helped me understand more clearly that the overall customer experience (mine in this case) is a major part of a product or service. I am sure that I could have tried to install any other messaging service and achieved the same result but would I have had such a good experience that I wanted to blog about it?


  • Swisscom selects Boundary for Cloud Monitoring and Operations Management

    Posted on September 11th, 2014

    Swisscom, one of the largest providers of IT services in Switzerland, has selected Boundary to provide monitoring and operations management of its future cloud infrastructure and in parallel, Swisscom Ventures has made a strategic investment in Boundary.

    In a strategic effort to offer its customers additional solutions and take advantage of its unique capabilities, the company is in the process of building a public cloud primarily based on OpenStack, as well as offering multiple new cloud-based services. Boundary, designed specifically to monitor modern, dynamic infrastructures, was selected as a core component of this new solution.

    To maintain operational awareness, complete visibility is needed across all components of the infrastructure. Boundary accomplishes this by combining metrics data collection at one-second resolution with event/log ingestion and processing. By collecting and processing hundreds of millions of metrics and events every second using a highly scalable low latency streaming technology, organizations are able to find problems that were previously invisible, hidden in averages and one-minute sampling.

    Boundary includes a rich set of APIs, which will enable Swisscom to easily integrate monitoring into their cloud offering. These, coupled with the solution’s multi-tenant capabilities, make it an ideal solution to manage Swisscom Cloud, which will eventually host hundreds of thousands, and possibly millions, of server instances.

    “The dynamic nature of cloud infrastructures can provide a monitoring and management challenge,” said Torsten Boettjer, Head of Technical Strategy Cloud at Swisscom. “With Boundary, we’ll be able to provide our customers with the best possible levels of service for their applications that will be on our Cloud infrastructure.”

    Swisscom Ventures invests in innovative areas that are strategic for Swisscom, with cloud-based solutions being a focus area of investment.

    “We were impressed not only with the Boundary solution but also with the management team and their innovative views of the future of Web-scale IT,” said Stefan Kuentz, Investment Director at Swisscom Ventures. “We are investing in their vision and are excited to support them on this journey.”


  • Customer success and the Billy Bullshitter experience

    Posted on September 5th, 2014

    About a week ago, I wrote a blog post about delivering high quality customer service and getting our entire organization aligned behind this goal.

    I have a few updates to share on this subject, both good and, errr, not so good.

    Let’s start with some dirty laundry. One of the success mantras for our team is that we should never have the same question asked more than once by our customers.

    If our product and documentation are not simple, obvious and helpful enough to enable customers to self-serve, then we are failing. If a customer asks us something, then that becomes an opportunity to improve – either the product or the docs. Remember how I talked about being proactive?

    Well, it wasn’t fully happening. In some areas we were doing really well, but in others we were not. I’m not sure exactly why – maybe individuals didn’t feel empowered to be proactive, maybe they were worried about making a mistake, maybe they were just taking the easy path – but whatever the reason, I consider it my failing that I have not managed to communicate and motivate the entire team to follow this path. (It was also a failure of some of our internal reporting metrics, where we didn’t have the full visibility needed – which we do now.) Need to improve.

    Another area we tackled this week is how we treat users when they sign up for our service. We tested many other monitoring solutions and in every case we received only automated emails. Personally I don’t mind automated emails – they are maybe a necessary evil – but what I do object to is automated emails that try to look like they came from a real person. One large, very well-known application monitoring vendor is clearly expert at this…trying to make automated emails look like a real person.

    We decided to try and do better, so we tasked our customer and technical success teams to respond manually to every user that signed up. We are not sure if this is scalable, because the volume is already a little overwhelming, but we tried to make it as easy as possible for our team to cope by giving them several different templates that they can use as a base and then modify accordingly. So, our current process is:

    1. User registers on web site
    2. User gets immediate access to product (no validation needed)
    3. This creates a “lead” in salesforce that, depending on geography, is assigned to a Customer Success Manager
    4. The CSM looks at the lead and at the individual that has signed up, sends an introductory email and also introduces the technical success manager
    5. If we haven’t heard anything, the technical success manager will follow up the next day to provide any assistance that the user may need or to get feedback.

    We actually turned off the automated “welcome” email as part of doing this…the welcome email is now from a human.

    Of course, no good deed goes unpunished. A user recently challenged us publicly…even referring to us as Billy Bullshitters (I love that)…suggesting that our emails were automated because they were sent from salesforce. Yes, our CSMs sit in front of salesforce, and yes, they use templates to help them deal with volume, but no, they are not automated – they are real people typing and pushing buttons to try and give just a little more personal service.

    We’ll see how it goes, but I am hoping that we can continue this path and not resort to automated everything as so many do – but then again I could be delusional and the robots may win.

    Would love to hear from others that have tried to be “better than the pack” – and will keep updating on progress.


  • Rules of Thumb

    Posted on September 4th, 2014

    Whoever said that 60% CPU was too high?

    In performance management there is something known as a “rule of thumb”. It is basically a guideline that has no real justification e.g. “as a rule of thumb, do not run your servers higher than 60% CPU busy”. People have lived by these rules for many years and rarely challenge them. Occasionally people will ask “why?” and be answered with a look of disdain as if they had asked a really stupid question.

    Truth is: many people don’t know why rules-of-thumb are valid in any circumstance. They just use them as they have no better basis for making judgments on performance.

    Recently at Boundary we had a situation where one of our own systems seemed to be creaking at the seams due to an apparent overload situation. The belief was that there was simply too much work coming in to the system and that we may need a bigger box or more boxes to handle it. The basis of that belief is in the title of this blog. When the application was starting to lag, the server CPU was hitting over 60% utilization. All other indicators were healthy – no memory issues, no disk queuing, and load average was low as well. As the application is a time critical streaming service, perhaps there weren’t enough CPU cycles in each nanosecond to process all the work in time. As a result, some people hypothesized that it must be a CPU issue based on the general rule-of-thumb.

    To explain the 60% rule-of-thumb you can look to a branch of mathematics, known as queuing theory. It is something that I actually studied a very long time ago and understood just enough to help me project an air of expertise when I was a performance management consultant.

    The one formula that I can still quote is: “total time equals service time over one minus the utilization”. It includes the Greek alphabet letter “rho” for the utilization number and this use of a dead language symbol adds to the mystique of the formula. This formula is for a single server queuing system with “exponential arrival rate” and “exponential service times”. It turns out that a CPU with a single processing unit with a varied workload fits this model quite well. If we use 1 second as an example for the service time we can estimate how long a “transaction” would take relative to the utilization of that single CPU.
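
    Written out (this is the standard single-server M/M/1 result; the symbols below are my own choice of notation), with S the service time and ρ the utilization:

    R = \frac{S}{1 - \rho}, \qquad W_q = R - S = \frac{S \rho}{1 - \rho}

    With S = 1 second and ρ = 0.6 this gives a total time R of 2.5 seconds, of which 1.5 seconds is spent queuing – the 60% row in the table below.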

    Utilization   Total Time                      Time in Queue
    20%           1 / (1 – .2) = 1.25 seconds     0.25 seconds
    40%           1 / (1 – .4) = 1.66 seconds     0.66 seconds
    60%           1 / (1 – .6) = 2.50 seconds     1.50 seconds
    80%           1 / (1 – .8) = 5.00 seconds     4.00 seconds

    If you draw this as a curve and fill in a few more data-points:

    [Chart: total time versus CPU utilization]

     

    You can see that the curve has a significant acceleration phase in the 60–80 percent utilization range, which is why 60% is viewed as a good rule-of-thumb for CPU utilization. But the key is that this ROT is based on “exponential arrival rate” and “exponential service time” for a single server system. In our case, we are running a high volume service that runs in Java Virtual Machines and is deployed on physical servers that have up to 40 CPUs. There is no other work on those systems. It turns out that none of the characteristics of arrival rate or service time applied in this case, primarily due to the nature of the streaming application and the way that the JVM manages its own workload. As a result, the 60% CPU utilization ROT was unlikely to be valid and so we spent more time diagnosing the issue.

    Cliff Moon, our CTO, found the real issue (thanks Cliff!) and it had nothing to do with the CPU. In fact it had nothing to do with any hardware resource. We had a software bottleneck in the system, a simple JVM setting that had been propagated from when this particular application ran on smaller servers. By increasing the maximum number of threads that the JVM could use, the application immediately started to consume more CPU and the workload lag was history. In a sense that software bottleneck was like a trap that sprung when the workload hit a certain level. Right up to that level, everything was fine. But over that level, the lag started to appear. It was pure coincidence that it happened at 60% CPU. In this case 60% CPU utilization was not a problem at all and, in fact, was too low!

    The final result is that we have proven that we can now run those servers at almost 100% utilization on average over long periods, and have achieved workload rates that we had never seen before. The takeaway here is that if you really want to maximize performance then you may need to challenge some long held rules-of-thumb and examine performance in a much more clinical way.


  • Actors, Green Threads and CSP On The JVM – No, You Can’t Have A Pony

    Posted on September 3rd, 2014

    I really wish people would stop building actor frameworks for the JVM. I know, I’m guilty of having done this myself in the past. Invariably, these projects fall far short of their intended goals, and in my opinion the applications which adopt them end up with a worse design than if they had never incorporated them in the first place.

    Let’s take a step back, however. What the hell are actors, and why is everyone so hot and bothered by them? The actor model describes a set of axioms to be followed in order to avoid common issues with concurrent programming, and in the academic world it provides a means for the theoretical analysis of concurrent computation. Specific implementations can vary substantially in how they define actors and in the restrictions on what actors can and cannot do; however, the most basic axioms of the actor model are:

    1. All actor state is local to that actor, and cannot be accessed by another.
    2. Actors must communicate only by means of message passing. Mutable messages cannot be aliased.
    3. As a response to a message an actor can: launch new actors, mutate its internal state, or send messages to one or more other actors.
    4. Actors may block themselves, but no actor should block the thread on which it is running.

    So what are the advantages to adopting the actor model for concurrent programming? The primary advantages center around the ergonomics of concurrency. Concurrent systems are classically very hard to reason about because there are no ordering guarantees around memory mutation beyond those which are manually enforced by the programmer. Unless a lot of care, planning and experience went into the design of the system, it inevitably becomes very difficult to tell which threads might be executing a given piece of code at a time. The bugs that crop up due to sloppiness in concurrency are notoriously difficult to resolve due to the unpredictable nature of thread scheduling. Stamping out concurrency bugs is a snipe hunt.

    By narrowing the programming model so drastically, actor systems are supposed to avoid most of the silliness encountered with poorly designed concurrency. Actors and their attendant message queues provide local ordering guarantees around delivery, and since an actor can only respond to a single message at a time you get implicit locking around all of the local state for that actor. The lightweight nature of actors also means that they can be spawned in a manner that is 1:1 with the problem domain, relieving the programmer of the need to multiplex over a thread pool.
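
    To make the “mailbox plus one message at a time” idea concrete, here is a minimal sketch – deliberately framework-free, with names of my own choosing – of an actor whose state is only ever touched by its own processing loop:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Minimal actor sketch: state is local, messages arrive through a mailbox,
    // and only the actor's own loop touches the counter - the implicit locking
    // described above.
    public class CounterActor implements Runnable {
        private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
        private long count = 0; // local state, never shared directly

        public void send(String message) {
            mailbox.offer(message); // the only way other code interacts with this actor
        }

        @Override
        public void run() {
            try {
                while (true) {
                    String message = mailbox.take(); // one message at a time
                    if ("stop".equals(message)) {
                        break;
                    }
                    count++; // mutate internal state in response to a message
                }
                System.out.println("processed " + count + " messages");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        public static void main(String[] args) throws InterruptedException {
            CounterActor actor = new CounterActor();
            Thread worker = new Thread(actor);
            worker.start();
            for (int i = 0; i < 3; i++) {
                actor.send("tick");
            }
            actor.send("stop");
            worker.join();
        }
    }

    Note that nothing in the body of run() stops arbitrary Java code from blocking the thread it runs on, which is exactly the limitation the rest of this post is about.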

    Actor aficionados will probably reference performance as an advantage of actor frameworks. The argument for superior performance of actors (and in particular the green thread schedulers that most actor implementations are built upon) comes down to how a server decomposes work from the client and how that work gets executed on a multi-core machine. The typical straw-man drawn up by actor activists is a message passing benchmark using entirely too many threads, run on a cruddy MacBook. It’s easy to gin up some hackneyed FUD against threads to market an actor framework. It’s much harder to prove a material advantage to adopting said framework.

    Unfortunately, actor frameworks on the JVM cannot sufficiently constrain the programming environment to avoid the concurrency pitfalls that the actor model should help you avoid. After all, within the thread you are simply writing plain old java (or scala or clojure). There’s no real way to limit what that code can do, unless it is explicitly disallowed from calling into other code or looping. Therefore, even the actor frameworks which use bytecode weaving to implement cooperative multi-tasking amongst actors cannot fully guarantee non-blocking behavior. This point bears repetition: without fundamental changes in how the JVM works, one cannot guarantee that an arbitrary piece of code will not block.

    When making engineering decisions we must always be mindful of the tradeoffs we make and why we make them. Bolt-on actor systems are complex beasts. They often use bytecode weaving to alter your code, hopefully without altering its meaning. They quite often rely on Java’s fork/join framework, which is notorious for its overhead, especially when it comes to small computations, and is fantastically complicated when compared to a vanilla thread pool. Actor systems are supposed to make parallel computation dead simple, but every lightweight threading system on the JVM that I’ve seen is anything but simple.

    Lest you think that I am a hater, I genuinely like actor oriented programming. I have been an enthusiastic Erlang programmer for a number of years, and I used to get genuinely excited about the activity around adding this paradigm to Java. However, I am now convinced that without support from the platform these lightweight concurrency libraries will always be a boondoggle. I’m not the only one to make this observation, either.

    We shouldn’t be trusting vendors who are pushing manifestos, decades old tribal knowledge about thread implementations, and misleading benchmarks. We should be building the simplest possible systems to solve our problems, and measuring them to understand how to get the most out of our machines.


  • Microservices, or How I Learned To Stop Making Monoliths and Love Conway’s Law

    Posted on August 27th, 2014

    After reading this post I can’t help but feel that the author has missed the point of having a microservices architecture (he misses other things as well, particularly the fact that there are a lot more folks out there writing software than just neckbeards and hipsters), especially considering the suggestion of service objects as a way to implement microservices. Most importantly, the reason to prefer a microservice based architecture is not encapsulation, data locality, or rigid interfaces. Microservice architectures are embraced because of Conway’s law.

    organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations

    —M. Conway

    An important corollary of Conway’s law, in my experience, has been that teams will tend to scale about as well as the software that they create. Teams working on a monolithic codebase will inevitably begin stepping on each other’s toes, which requires more rigid software engineering processes, specialized roles such as build and release engineering, and ultimately a steep decline in the incremental productivity of each new engineer added to the team.

    The OP claims not to know of a good definition for microservices. I’d like to propose one: a microservice is any isolated network service that will only perform operations on a single type of resource. So if you have the concept of a User in your service domain, there ought to be a User microservice that can perform any of the operations required to deal with a user: new signups, password resets, etc.
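
    As a concrete illustration of that definition – the interface below is entirely my own, not a reference to any real codebase – the surface of such a User microservice stays small because every operation concerns the one resource it owns:

    // Illustrative only: a microservice boundary where every operation concerns
    // a single resource type (User) and nothing else.
    public interface UserService {

        // The resource this service owns.
        class User {
            public final String id;
            public final String email;

            public User(String id, String email) {
                this.id = id;
                this.email = email;
            }
        }

        User signUp(String email, String password); // new signups
        void resetPassword(String email);           // password resets
        User findByEmail(String email);             // lookups other services can call
        void deactivate(String userId);             // other lifecycle operations on a user
    }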

    This definition jibes well with Conway’s law and the real reasons why microservices are good. By limiting services to operating on a single type of resource we tend to minimize the interaction with other components, which might be under parallel development or not even implemented yet. Every dependency must be carefully considered because it adds overhead for the implementor, not just in code but in communications. Indeed a microservice architecture resembles what your team actually is: a distributed system composed of mostly independent individuals.

    And to the claim that microservices introduce a distributed system into what was once a non-distributed environment, all I can say is this: anyone building software delivered via the web who doesn’t think they are working on a distributed system fundamentally misapprehends the nature of what they’re doing. We’re all distributed systems developers now, whether we realize it or not.


  • The New Factory Floor

    Posted on August 26th, 2014

    I get a lot of weird promoted tweets that I often make fun of; however, one I got last night made me think.

    [Image: the promoted tweet]

    So, ostensibly this is meant to entice potential job candidates. However, the only thing discussed is the set of technology in use. There’s no mention of the mission, what the company does, or who you’d be working with. This got me thinking about a number of trends I’ve seen recently in the technology world: the rise of so-called “hacker schools”, the talent crunch, and the growing popularity of server side JavaScript. Suddenly, to be eminently employable at a relatively high level of income, one need only attend a 2-3 month course, learn JavaScript and put together a few portfolio projects. The ability to do it all while learning only one language lowers the barriers to entry further than they’ve ever been.

    These trends remind me of the first rise of manufacturing jobs in the US after the second world war. With almost zero education necessary, the average American had little difficulty securing a manufacturing job with pay and benefits that could readily support a middle class lifestyle. There are major differences, of course, between then and now. No matter how drastic the talent shortage seems to be, it seems unlikely that we will ever see numbers as high as 25% of all jobs going to programming, which was the peak for manufacturing.

    What will be interesting to watch, and what ultimately concerned me about that job advertisement, is how much the supply of programmers will dictate technology choice. I’ve long argued against hiring based on familiarity with any particular technology, instead favoring the domain knowledge, ability and motivation of a candidate. However, many firms may see an advantage in standardizing their technology stack around something that they can specifically recruit towards. It will also be interesting to see how long it lasts. Market imbalances like the talent crunch cannot last forever. If demand doesn’t collapse due to larger circumstances such as a bubble burst, then other firms will step in to help automate away what would normally be hired into. One could argue that most SaaS companies are already doing this, piece by piece. However, unlike in the manufacturing industry, roughly the same set of skills is needed to displace a job as to do the job.


  • Sometimes even the easy things can take a long time

    Posted by on August 23rd, 2014

    The other day I decided to find out how easy or how difficult it was to get a custom metric into our newly launched Boundary Premium service. I decided to do this totally on my own, without any help or guidance from others within Boundary. I also decided to do it in Java, a language that I am reasonably proficient in, but certainly not proficient enough, as this experience proves.

    I read the documentation, which effectively said that I should HTTP POST a simple JSON object containing the metric ID and the metric value to the measurements endpoint. The metric definition is set up through the application interface and was very straightforward.

    [Screenshot: the metric definition settings]

    (I later found out that I could do this dynamically from my application if I wanted).

    So, I wrote some code, rewrote some code, wrote some more code, deleted a lot of code and… 3 hours later, I had managed to send some metric values. Why did it take so long? Because I didn’t know how to authenticate to a website using basic authentication. In all my previous applications that authenticated to a website, I had used an authenticator method in my code. As it happens, that method does work with Boundary Enterprise, but it didn’t work with Boundary Premium.

    It was basically because I didn’t really know what I was doing. Luckily, the internet has all the answers and eventually I found the right one. So, to save anyone else from falling into a similar three-hour trap, try this:

    import org.apache.commons.codec.binary.Base64;
    
    ...
    
    // con is the HttpURLConnection used to POST to the measurements endpoint
    String authString = "{my email address}:{my api token}";
    
    // Base64-encode the credentials and send them in the Authorization header
    byte[] authEncBytes = Base64.encodeBase64(authString.getBytes());
    String authStringEnc = new String(authEncBytes);
    
    con.setRequestProperty("Authorization", "Basic " + authStringEnc);

    I am sure that many experienced Java programmers will be amused at my lack of basic knowledge, but sometimes we all need a little helping hand. I hope that the above will help someone somewhere get their custom metrics into Boundary Premium faster than I did.

    Having said that, it was pretty straightforward once I had got over the rookie hurdle!

    [Screenshot: the resulting dashboard]

  • Green cards and metrics

    Posted by on August 20th, 2014

    Quick story… last November I got married, went on honeymoon and, when returning to the US (via Puerto Rico), realized that I had left my green card at home (I guess I had other things on my mind). Several hours and many hundreds of dollars later, I was allowed to leave immigration in PR and return home, having been told that they had to cancel my green card because they could not let me into the country without it.

    Now, I won’t even begin to discuss the preposterous nature of this situation. You have all my records on file: my iris scan, my fingerprints, my passport, my countless trips out of the country over the last 13 years (and, of course, many other things that I don’t even know about). I was even a member of the Global Entry trusted traveler program (after going through many background checks). But no, because I did not have that little piece of plastic, the whole system broke down.

    I won’t bore you with the ensuing details, except to say that it is now 9 months later, I have been to the immigration office many times and I STILL don’t have my green card. Which means that every time I travel, I have to allow an extra hour or two to come through immigration, because they always refer me to secondary inspection; eventually someone looks me up on a computer (imagine that) and says, “OK, you’re free to go.” My family now comes through immigration separately from me because they are fed up with waiting.

    Long story, but I mention it purely because I was planning to stay in SF tonight with my wife and daughter, then realized that I have an appointment with immigration early tomorrow morning in San Jose. So instead I am at my house on my own, writing this post.

    It may of course all be a plan by our investors because when I am home alone, I tend to work and then work some more.

    Tonight it is metrics. I happen to love numbers; my kids think I am strange because of that, and the rest of my family often refers to me by strange nicknames. But… and here is my question…

    I would like to procure a simple-to-use SaaS solution for metrics collection and reporting, with easy-to-implement connectors to the common tools we use to run our business: salesforce.com, Totango, Pardot, QuickBooks, Recurly, Desk.com, etc.

    I don’t want a downloadable Windows package (SiSense), I don’t want to spend a small fortune (Domo), and I don’t want something where I build my own connectors (GoodData). Please can I just get something that works and wake me from my Excel nightmare!

    Add a comment or email me… I’ll personally send a bottle of wine to anyone who recommends something that I end up actually using.


  • Customer success is all inclusive

    Posted by on August 19th, 2014

    I will use this blog to chat about life as the CEO of a startup. It will give insights into what really goes on, which is often (always?) very different from the marketing rhetoric.

    It might not always be enlightening, these might not always be the best-written posts, and you might think it a complete waste of words, but it will be real and honest.

    One of my areas of focus right now is to ensure that everyone at Boundary is constantly thinking about how to improve our customer experience.

    When I discuss this with others on our team, they tell me that we must create defined programs and actions for individuals to take. I know they are correct, and I’m fortunate to be working with people who can implement my ranting, but what I really want is for everybody who works at Boundary to have this as an underlying philosophy in everything they do. I don’t want us to wait to be asked; I want everyone to be proactive. See something that can be improved? Then take action.

    We want our customers to never need to speak to us; of course we love communicating with our customers, and we are constantly seeking feedback (how else do you learn?), but we want our products to be completely intuitive and to always provide the answers the customer needs.

    Our customer success team works from the principle that we should never be asked the same question more than once. Either the product experience should be improved to ensure the question doesn’t need to be asked or docs should be updated. “How to” questions from customers are a huge opportunity to improve.

    But the other point, which I think might come as a surprise to some, is that this is an all-inclusive philosophy. It doesn’t matter whether you work in engineering, marketing, sales, customer success, product management, finance, operations, HR or anywhere else: every single person at Boundary can affect how our customers perceive us, and therefore we each have a responsibility to play our part.

    A great product followed by incorrect invoicing can leave a bad taste. Misleading content on our web site can get the relationship off on the wrong foot. A support rep that commits to “get back to you tomorrow” and then takes off for the weekend is frustrating and annoying. A customer success rep that doesn’t return your email quickly makes you feel like a low priority.

    A customer said to me once “I know you must actually be really busy, but never once have you made us feel that you have anything else to do that is more important than we are”.

    That’s how I want our customers to feel.


  • Free Monitoring <3

    Posted by on August 7th, 2014

    When we talk to our customers, there are a few things we’ve heard above all else over the last several months: folks wanted a free offering, they wanted host-level and generic metrics, and they wanted it all to be dead simple to set up and use. We listened, which is why we’re excited by this week’s release of free server monitoring. And early feedback has been fantastic.

    And it’s only going to get better from here. Sign up today and get your 10 free servers.

    10 Servers Free = Monitoring <3


  • Erlang MapReduce Queries, MultiFetch and Network Latency with Riak

    Posted by on June 25th, 2014

    I know, you’re looking at your calendar; let me be the first to assure you it’s not 2011. I recently had the need to write some Erlang MapReduce queries for Riak and it was a bit of an adventure. The Riak MapReduce documentation is good but generally focused on JavaScript. If you’re using Riak, it’s quite possible you’ve never had the need to use its MapReduce capabilities. We hadn’t really used it at Boundary before I dug into some performance problems, and it’s probably not regarded as one of Riak’s strengths. With that said, though, it’s a nice feature and was worth some investigation.


    Slow code

    To provide a bit of context let me first describe the performance problem I was investigating. Boundary customers were experiencing poor response time from a service that is responsible for managing metadata for Boundary meters. The service is called Metermgr and it’s a webmachine/OTP application that relies on Riak for persistence and exposes meter metadata with a REST interface.

    I noticed that as the set of meters for an organization grew, there appeared to be a simple regression in a certain query’s response time. For queries with as few as 200 keys, response time was between 2 and 4 seconds. After taking a look at the code, I was able to pinpoint the cause of the slowdown to a function called multiget_meters. Unfortunately, this function didn’t multiget anything; rather, it iteratively fetched the meters one by one. Oof.
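    As a sketch of that anti-pattern, the fetch looked roughly like the following, assuming the standard riakc_pb_socket client (the function body here is illustrative, not the actual Metermgr code):

    -module(meter_fetch_sketch).
    -export([multiget_meters/3]).
    
    %% Sketch of the slow path: one blocking round trip to Riak per key.
    %% Illustrative only; this is not the actual Metermgr implementation.
    multiget_meters(Pid, Bucket, Keys) ->
        [begin
             {ok, Obj} = riakc_pb_socket:get(Pid, Bucket, Key),
             riakc_obj:get_value(Obj)
         end || Key <- Keys].

    Even at sub-millisecond network latency, paying a full round trip per key adds up quickly once the key set reaches a few hundred entries.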



    Anyway, my initial thought was, “I’ll just use MultiFetch.”

    Does Riak support MultiFetch/MultiGet?

    If you’re familiar with the more popular Riak clients or search around the internet for “riak multiget” you might get the impression that Riak supports retrieving multiple values in a single HTTP or Protocol Buffers request, sometimes referred to as “multiget or multifetch”.

    Unfortunately, that’s not the case; take a look at the source and you’ll see that Riak itself doesn’t support this capability. Rather, some Riak clients provide the functionality by parallelizing a set of requests and coalescing the results. The riak-java-client is one such example.
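    In Erlang, that parallelize-and-coalesce approach amounts to something like the sketch below (illustrative only: it assumes a small pool of riakc_pb_socket connections and omits timeouts, retries and error handling):

    -module(multifetch_sketch).
    -export([multifetch/3]).
    
    %% Fan one get per key out across a pool of riakc_pb_socket connections,
    %% then coalesce the replies in key order. Illustrative only: no timeouts,
    %% retries or error handling. A pool is used so the gets are not all
    %% funneled through a single socket.
    multifetch(Pool, Bucket, Keys) ->
        Parent = self(),
        Conns = cycle(Pool, length(Keys)),
        Refs = [begin
                    Ref = make_ref(),
                    spawn_link(fun() ->
                        Parent ! {Ref, riakc_pb_socket:get(Pid, Bucket, Key)}
                    end),
                    Ref
                end || {Pid, Key} <- lists:zip(Conns, Keys)],
        [receive {Ref, Result} -> Result end || Ref <- Refs].
    
    %% Repeat the connection pool until it covers N keys.
    cycle(Pool, N) ->
        lists:sublist(lists:append(lists:duplicate((N div length(Pool)) + 1, Pool)), N).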



    Having had experience with the Java client, I incorrectly assumed that the official Erlang client had a similar implementation, but if you check out the source you’ll notice it doesn’t support MultiFetch. I did a bit of archeology and found a lot of posts with questions and requests around implementing multifetch in the Riak Erlang client. Most of these posts point the user towards using MapReduce. The most useful thread I could find on the subject can be found here; not surprisingly, it is entitled multi-get-yet-again!


    MapReduce in Riak

    Implementing MultiFetch in Erlang wouldn’t be too difficult, but several users reported very good performance using the MapReduce approach, with a couple of caveats:

    1. I heard MapReduce in Riak is slow (hearsay etc…).
    2. MapReduce queries in Riak clusters are run with R=1.

    Unfortunately, the latter is a serious problem and I would like to see it addressed, but for now let’s disregard it as outside the scope of this discussion. It’s fine, take him outside and show him the pool, get him a cookie, he’ll be fiiiiiiine, etc…

    The MapReduce docs on Basho’s website are pretty good, but there’s a lot of material to sift through in order to find the most relevant pieces of information to get started quickly. After doing so, though, I’m pleased to say that using Erlang MapReduce queries with Riak is quite easy, and there are really only two important things you need to know to get started.

    1. Riak has built-in Erlang MapReduce functions and you can use these to address many common use cases. You should learn how to use these first.
    2. You can write custom Erlang MapReduce functions, but you need to compile and distribute the object code to all Riak nodes.

    As noted in the docs, the basic MapReduce function riakc_pb_socket:mapred/3 takes a client, a list of {Bucket, Key} tuples as input, and a list of Erlang query phases. Let’s dig into a query phase a bit more; it looks like the following:

    {Type, FunTerm, Arg, Keep}
    
    Type    - an atom, either map or reduce
    FunTerm - a tuple:
                for built-in functions use {modfun, Module, Function}
                for custom functions use {qfun, Fun}
    Arg     - a static argument (any Erlang term) passed to each execution of the phase
    Keep    - true | false - whether to include this phase's results in the final value of the query
    

    The examples in the documentation focus heavily on writing your own qfun queries, though as I mentioned you can’t just use qfun without some upfront work. As the documentation notes:

    [Screenshot of the Riak documentation’s note on qfun requirements]

    In addition, there is another paragraph, in a section called “A MapReduce Challenge”, that states:

    [Screenshot of the “A MapReduce Challenge” section of the Riak documentation]

    In summary, if you want to write custom MapReduce queries in Erlang, you need to compile and distribute your code to the Riak nodes. I’ve gotten so comfortable using erl as a REPL that I glossed over this and assumed I could simply pass function references and they’d be evaluated. If you don’t take the time to read and fully understand the documentation, you might skim past those qfun requirements and just start writing your own custom queries, like me and this guy. Combine that with the fact that qfun MapReduce error messages are generally quite opaque, and that can lead to a bit of frustration when getting started.

    I’d prefer the documentation break out the difference between built-in and qfun queries more clearly, focusing on modfun examples initially with a separate qfun section, preferably with a big red callout yelling “Hey Dummy, don’t try this yet”. The JavaScript MapReduce API doesn’t suffer from this limitation, of course, because JavaScript is interpreted via the SpiderMonkey JS engine that ships with Riak. Perhaps that, and the recent popularity of JavaScript, is why it gets so much more attention in the docs.


    Simulating MultiFetch with Built-In MapReduce Queries

    So, back to the point: it’s best we understand the built-in queries before going any further. Here’s a quick walkthrough of the default map functions that are provided.

    map_identity          - returns a list containing the riak_object for each bucket/key
    map_object_value      - returns a list of the values stored in each key (calls riak_object:get_value(RiakObject))
    map_object_value_list - calls riak_object:get_value(RiakObject), assumes the value is a list, and returns a merged list
    

    There are reduce phases as well, but to achieve MultiFetch-like capabilities we only need to concern ourselves with the map_object_value map function. We can achieve our original multifetch use case by substituting a single mapred call that uses the built-in map_object_value phase for the iterative, key-by-key fetch in multiget_meters; a sketch of that substitution follows.
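    Concretely, the replacement looks something like this sketch (illustrative only; it assumes the built-in phase is the map_object_value function in the riak_kv_mapreduce module that ships with Riak, and that the function name is hypothetical):

    -module(mapred_fetch_sketch).
    -export([multiget_meters_mapred/3]).
    
    %% Fetch the values for many keys in a single mapred request using the
    %% built-in map_object_value phase. Illustrative sketch only.
    multiget_meters_mapred(Pid, Bucket, Keys) ->
        Inputs = [{Bucket, Key} || Key <- Keys],
        Query  = [{map, {modfun, riak_kv_mapreduce, map_object_value}, none, true}],
        {ok, [{0, Values}]} = riakc_pb_socket:mapred(Pid, Inputs, Query),
        Values.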

    As expected, a quick set of tests against the production cluster showed that we’d reduced the query from 2-4 seconds down to an acceptable (albeit not blazingly fast) average of approximately 115 milliseconds.


    Comparing to MultiFetch in Java

    These results of course got me thinking about how Erlang mapred would perform compared to MultiFetch in Java on the JVM, so I decided it was worth gathering some data. I constructed a test for 20, 200, and 2000 keys (this is not a benchmark), ran each of the 3 tests 100 times, gathered samples and calculated the average and variance. I ran the tests on a server in the same data center and on the same broadcast domain as the Riak cluster. As expected, MultiFetch outperformed mapred, and the latency of MultiFetch (as noted by Sean Cribbs and the Riak documentation) was more predictable.

    [Chart: response time in ms, with network latency between 0.1 and 0.4 ms]

    As the number of keys increased by orders of magnitude, query response time became less predictable with both approaches, though MapReduce’s variance was greater. Many raw MapReduce samples fell within ~600ms, but there were also several samples between ~900ms and ~1400ms.


    When might MapReduce be faster?

    This had me wondering whether there are any situations where MapReduce might be preferable to MultiFetch, or whether I should always just use MultiFetch. MultiFetch seems to be the prevailing sentiment: it’s what most clients implement, and even Basho sometimes seems reticent about suggesting the use of MapReduce. I decided to run the same set of tests again, but this time from Metermgr running locally on my laptop, connecting to the production Riak cluster over the VPN.

    [Chart: response time in ms, with network latency between 100 and 300 ms]

    While the results are somewhat expected, they are interesting nonetheless. Initially, with a key set of 20, MultiFetch overcomes the added network latency and outperforms MapReduce, but as the key set grows by an order of magnitude the average MapReduce query time beats MultiFetch by a factor of 2. Variance remains higher with MapReduce, since adding network latency doesn’t change the variance we already saw at sub-millisecond latency.

    We all know that situating your application servers near your database is important for performance, but in an age of “hosted this and that”, PaaS and DBaaS, as a developer you may end up using a database or service where network latency becomes a factor. In the example above, with the MultiFetch approach network latency is compounded as the input set grows, whereas MapReduce takes that hit only once, hence the improved average response time.

    I would of course be remiss if I didn’t mention that Boundary is an excellent tool for monitoring the performance of such techniques, providing 1-second resolution of average response time for Riak Protocol Buffers queries, whether they run within the same data center or across the internet.

    Where to go from here?

    Well, I’ve got a solution for my performance problem that meets my near-term needs. I’m interested in digging into alternative clients and seeing whether a MultiFetch implementation for Riak exists in Erlang; if I don’t find one I like, I will write my own. I also believe it’s incorrect to say “MapReduce in Riak is slow”; in fact, under certain input constraints and configurations it is not only acceptable, it is preferable to the MultiFetch approach, provided latency predictability is not too much of a factor. The problem is more nuanced than “should I use MapReduce”, and it’s more abstract than MapReduce and Riak: it is about read techniques and their performance within the constraints of a distributed system. There are problems and there are tools, and we need to use the right tool for the problem at hand.

    I’m looking forward to digging into more custom Erlang queries and can already envision situations where Riak MapReduce might be favorable. Finally, if you’re using Riak but haven’t dug into custom MapReduce queries because you’re not comfortable with Erlang, then it’s about time you learn you some.

    Special thanks to @pkwarren for peer review; without his grammatical support this post would be unreadable.


  • Web-Scale IT – “I know it when I see it…”

    Posted by on May 27th, 2014

    Recently at Boundary, we’ve been talking a lot about “Web-Scale IT”.  One of the first questions we usually get is,  “What exactly is Web-Scale IT?”  Cameron Haight of Gartner first coined this term in a 2013 blog  and said,  “What is web-scale IT?  It’s our effort to describe all of the things happening at large cloud services firms such as Google, Amazon, Rackspace, Netflix, Facebook, etc., that enables them to achieve extreme levels of service delivery as compared to many of their enterprise counterparts.”

    But when we answer this, we are tempted to fall back on cliché. In a famous opinion offered by Justice Potter Stewart in the 1964 case of Jacobellis v. Ohio, Stewart wrote:

    “I shall not today attempt further to define the kinds of material I understand to be (pornography)…But I know it when I see it…”

    That’s how we feel about Web-Scale IT, we have a hard time defining it, but we know it when we see it!

    We see it when we walk into an enterprise and hear more about the cloud than the datacenter.  We see it where release cycles are measured in weeks versus quarters.  We see it when tools like Chef are used for deployment.  We see it when we are talking to the head of DevOps.  Where there are sprints but not waterfalls.  Where the team is talking about continuous deployment, provisioning instances and open source components instead of next year’s release, hardware acquisition, and packaged software. When we see these things, we know we are seeing Web-Scale IT happening.

    The funny thing is, we see Web-Scale IT everywhere we look, from the newest start-ups to the most conservative enterprises. Web-Scale IT is not just for the Amazons, Googles and Netflixes of the world. We see it at Fortune 500 insurance companies, health care companies and manufacturers. At media companies, SaaS start-ups and service providers. In enterprises of every shape, size and flavor.

    Gene Kim, commenting on the adoption of DevOps in the enterprise, recently wrote in the CIO Journal,

    “The important question is why are they embracing something as radical as DevOps, especially given the conservative nature of so many enterprises? I believe it is because the business value of adopting DevOps work patterns is even larger than we thought. And those not transforming their IT organizations risk being left behind, missing out on one of the most disruptive and innovative periods in technology.”

    We couldn’t agree more. The confluence of Cloud, DevOps, Open Source, and competitive pressure has put us at a crossroads in the history of Information Technology. Web-Scale IT lets us build better applications, faster. It lets us change them more quickly. And it lets us scale them more cost-effectively and with greater agility.

    There is no doubt in our mind that Web-Scale IT is here to stay.  But Web-Scale IT is not without its challenges.  One of these challenges is ensuring high levels of service quality and delivery.  Boundary’s customers are some of the leading adopters of Web-Scale IT, whether they call it that or not.  We are excited to provide them a critical service  that helps them  successfully cope with the challenges of operating in this new, compelling environment, allowing them to anticipate and solve problems faster, and to keep up with the pace of application and infrastructure changes that are typical of Web-Scale implementations.

    So while it might not be easy to define Web-Scale IT, we know it when we see it, we are seeing it everywhere, and we are doing our best to help our customers make it deliver on its huge promise.


  • A “Quantum Theory” of IT Monitoring

    Posted by on May 20th, 2014

    There are certain things which are true in the quantum world but just make no sense in our reality. I remember, in an advanced college physics course, having to calculate the likelihood that a baseball thrown at a window will pass through and emerge on the other side, leaving both the ball and the window intact, due to quantum effects and tunneling. I was astonished to see that while the odds of this happening are infinitesimally small, they are not zero. Never mind the fact that you’d have to continuously throw the ball at the window, not accounting for breakage, for longer than the universe has existed to have even a remote chance of observing this; the odds are not zero and can be calculated. And at the sub-atomic level, as opposed to that of physical objects, this type of behavior isn’t just common, it is expected. This small fact has stuck with me for decades as a great illustration of how odd the quantum world truly is.

    What then does that possibly have to do with IT Monitoring?  It might be a stretch, but I think the new world of applications, which we call Web-Scale, is in some ways as strange to traditional monitoring products as the world of Quantum behavior is to baseballs, windows and normal humans.

    Let me explain. In the past, we built applications that were not quite so sensitive to small changes in infrastructure performance, for two main reasons. First, our users had very low expectations. From batch, to time sharing, to PC networks, to early web applications, we became accustomed to waiting for a screen to advance, an hourglass to spin, a web page to update. But somewhere along the way in the last couple of years, our expectations changed. Movies stink when they stall, missed stock quotes can cost us real money, and we voraciously hang on our phones and tablets for real-time updates of everything from sporting events to natural disasters, to pictures and updates from loved ones, to new orders from customers.

    Second, we just got tired of the standard practice of over-provisioning data centers for peak loads, running at 50% capacity or less to ensure performance.  Despite falling hardware costs, our appetites for data and applications just kept growing.  So we virtualized everything, and when we tapped out the efficiency there, just like we stopped building power plants at office buildings decades ago, we went to the cloud, where we could “scale” on demand, and share the economies of scale of computing experts.

    Yet while the entire infrastructure changed, and the costs of performance delays and degradations increased, we happily kept monitoring things every five minutes or so, or even every hour, checking for the same things we used to: capacity, resource utilization, and the like. Yet today users scream and customers leave over 5-second delays. Outages of streaming information cost us money. The “quantum” of time we care about has shrunk dramatically to match the needs of the new application infrastructure, applications and user expectations. We live in a real-time world, yet we continue to monitor our last architecture.

    Which brings me to another engineering theorem deep in my memory: the Nyquist-Shannon sampling theorem, which in its simplest form says that in order not to lose information, the frequency you sample at needs to be at least 2x as fast as the event you want to capture. If you sample any slower, your reconstructed signal suffers from “aliasing”, or loss of information.

    Today’s Web-Scale IT architectures and demanding users care about changes and delays that last a few seconds, sometimes even less. If our quantum of caring is now measured in a second or two, Nyquist, and common sense, say we had better be capturing and processing monitoring data every second or so as well.

    Last-generation IT monitoring solutions simply CAN’T capture and process data fast enough. They can stare all day at the baseball, but it will never tunnel through the window. And unlike our quantum baseball example, the slow sampling of infrastructure monitoring data leaves us blind to things that actually happen and that we actually care about: stalled video, missed quotes, lost business opportunities, service delays and outages that cost us money.

    Our new math of IT monitoring needs to measure in seconds; it’s as plain and simple to see as the shattered window that I am staring at right now.

