Maintaining a Kubernetes cluster is an ongoing challenge. While Kubernetes aims to simplify managing containerized applications, it introduces several layers of complexity and abstraction. A failure in any one of these layers can result in crashed applications, resource overutilization, and failed deployments.
We’ve recently rolled out several long-overdue improvements to the Papertrail™ dashboard. The old dashboard was adequate, but it didn’t do a great job of presenting larger accounts with many groups, searches, or systems. Lack of personalization was another issue. For example, it assumed every user cared equally about every group in the account. The new dashboard addresses these problems with a flexible design that can be tailored to individual preferences.
When your infrastructure doesn’t offer the scalability to add hardware and applications without huge monetary investments, you can turn to cloud hosting. Microsoft Azure caters to businesses with mainly Windows environments, but Azure hosting can be difficult to monitor as you scale up resources. As you add more VMs and applications to your cloud, you may struggle to keep track of logs across the entire network. Every time you create a new VM, upload a new application, develop a new website, build a new database, or add any other new resource, Azure produces a variety of logs stored in different locations. This can make it difficult to find the information you need to monitor your services or troubleshoot problems.
I remember the days when I’d develop using simple Linux command line tools. When I worked at Amazon almost 10 years ago, I used a lot of old-school tools like Vim, ssh, and grep. It took some time to get familiar with them, but I figured them out just by reading a manual page or watching my coworkers. For better or worse, developer and ops tools are getting so complex we have to read books to become proficient. Newer tools offer more features and scalability, but that comes at the cost of being harder to learn and manage. How should we decide when to stick with simple tools and when to invest in more powerful but complex ones?
With the popularity of microservices, cloud integration, and containers, the distribution of log files can get out of hand. If you have several dozen applications distributed across the cloud, it gets difficult to aggregate and review logs when something goes wrong. When you distribute applications in this way, log aggregation is more important than ever to quickly analyze and fix problems.
When troubleshooting, a single search can become a theme with variations. Even if it’s not worth saving, it’s worth remembering for a while. Until today, Papertrail didn’t make getting back to recent searches especially easy, but that’s changing.
We’ve been working on a few new features that will dramatically increase the speed at which logs are searched and enhance visibility into log volume. These features are immediately available to new customers and will be rolled out to existing customers in the coming weeks.
Papertrail now officially supports automated setup of the remote_syslog2 logging daemon with Chef, Puppet, and Salt:
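Whichever configuration management tool you use, the end result is the same: remote_syslog2 reads a YAML configuration file (commonly /etc/log_files.yml) listing the files to watch and the Papertrail destination to ship them to. A minimal sketch of that file is below; the hostname and port are placeholders, and you’d substitute the log destination shown in your own Papertrail account:

```yaml
# /etc/log_files.yml — files for remote_syslog2 to tail
files:
  - /var/log/httpd/access_log
  - /var/log/myapp/*.log

# Where to send the events (placeholder host/port; use your
# account's log destination from the Papertrail web UI)
destination:
  host: logsN.papertrailapp.com
  port: XXXXX
  protocol: tls
```

The Chef, Puppet, and Salt integrations generate this file (and manage the daemon) for you, so the snippet is mainly useful for understanding what they configure under the hood.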
For years, Papertrail has been able to alert on searches that match events, but what about when they don’t? When a cron job, backup, or other recurring job doesn’t run, it’s not easy to notice the absence of an expected message. Now, Papertrail can do the noticing for you. Today we’re excited to release inactivity alerts, which can notify you when a search stops matching events.
Ever wanted to quickly see which systems haven’t sent logs recently? Now it’s as easy as checking a traffic light. Visit the Dashboard and click a group name, then scan the list of systems: