Papertrail

The revolution will be verbosely {,b}logged

Hello cron job monitoring & alerts, goodbye silent failures

Posted by @coryduncan on

Papertrail has had the ability to alert on searches that match events for years, but what about when they don’t? When a cron job, backup, or other recurring job doesn’t run, it’s not easy to notice the absence of an expected message. But now, Papertrail can do the noticing for you. Today we’re excited to release inactivity alerts, offering the ability to alert when searches don’t match events.

Set up an inactivity alert

From the create/edit alert form, choose “Trigger when no new events match”

Inactivity alerts

Once saved, the alert will send notifications when there are no matching events within the chosen time period. Use this for:

  • cron jobs
  • background jobs which should run nearly all the time, like system monitors/pollers and database or offsite backups
  • lower-frequency scheduled jobs, like nightly billing

Try it

If you have cron jobs, backup jobs, or other recurring or scheduled jobs, they almost certainly already generate logs. Here’s how to have Papertrail tell you when they don’t run or run but don’t complete successfully:

  1. Search for the message emitted when a cron job finishes successfully (example: cron)
  2. Click “Save Search”
  3. Attach an alert, for example to notify a Slack channel or send an email, and choose “Trigger when no new events match.”

No logs? No problem.

Very rarely, a recurring job doesn’t generate log messages on its own. For those, use the shell && operator and logger to generate a post-success log message. For example, ./run-billing-job && logger "billing succeeded" will send the message billing succeeded to syslog if and only if run-billing-job finishes with a successful exit code. Use "billing succeeded" as the Papertrail search.
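As a minimal sketch, the same pattern can be wrapped in a small shell function so several jobs can share it. The function name run_and_log and the crontab path in the comment are illustrative assumptions; only the && operator and logger come from the description above.

```shell
#!/bin/sh
# Hypothetical helper: run a job and emit a success marker to syslog
# only when the job exits with status 0.
# Usage: run_and_log "marker text" command [args...]
run_and_log() {
  marker="$1"
  shift
  # "$@" runs the job; logger only runs if the job succeeded.
  "$@" && logger "$marker"
}

# The same idea inline in a crontab entry (the path is a placeholder):
# 0 2 * * * /path/to/run-billing-job && logger "billing succeeded"
```

Because the marker is only written on success, a failed or skipped run produces no matching event, which is exactly what an inactivity alert on "billing succeeded" looks for.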

What do you think?

Give inactivity alerts a try and if you have questions or feedback, let us know.

Green means go(od): Spot inactive log senders at a glance

Posted by @lyspeth on

Ever wanted to quickly see which systems haven’t sent logs recently? Now it’s as easy as checking a traffic light. Visit the Dashboard and click a group name, then scan the list of systems:

Systems with activity status

  • Green-light systems are currently sending logs
  • Yellow-light systems aren’t currently sending logs, but have sent logs in the last 24 hours
  • Red-light systems haven’t sent logs in the last 24 hours

Text on the right shows when Papertrail last saw logs.

It’s an easy way to make sure critical systems are still logging after a deployment or upgrade.

Try it out, and if you see anything unusual, or just want to opine on the intervals or colors, tell us.

Advanced event viewer keyboard shortcuts

Posted by @coryduncan on

Today we’re excited to release two new keyboard shortcuts within the event viewer:

Highlight and link to multiple events

When a series of events is relevant, it can be useful to share those events with teammates. This is now possible.

Holding Shift will put the event viewer into “selection mode.” While holding down Shift:

  • Start a selection by clicking the selection button next to an event.
  • Select a range by clicking a selection button above or below an existing selected event.

Event selection

The browser URL will update to indicate which events are selected. From there it’s as easy as copying and pasting the link. Clear a selection by pressing Esc.

Retain search query when clicking

Extending the idea of flexible context to the entire event viewer, holding Alt while clicking a link will retain (instead of replace) the current search query. This works for orange and blue context links as well as click-to-search.

Retain search query

To see these and all other keyboard shortcuts, press ? while in the event viewer. Try out the new shortcuts and as always, let us know if we can do better. Enjoy!

Use Zapier to send logs anywhere

Posted by @lyspeth on

Papertrail’s search alerts are great, but what happens when you need a specialized integration, or want to grab something other than raw messages and counts – like particular fields from a message, or data to analyze later?

Now, you can invoke a Zapier action using a webhook trigger, which can then perform any desired action in Zapier. This example Zap setup sends data on printer service behavior to a Google Sheet for later analysis.

Set up the Zap

Sign up for Zapier, or log in.

Create a new Zap. Under Built-In Apps, select Webhooks by Zapier, then Catch Hook, and save the new Zap.

In the Pick off a Child Key dialog that appears, enter payload.events to get to the event details, then click Continue to show the Zap’s webhook URL.

Set up the alert

Create a saved search to find the lines of interest, then add a Zapier alert integration. Grab the webhook URL and paste it into the alert, then save.

Process data

Create a Google spreadsheet with column names for the relevant fields: received_at, source_name, program, message.

printer status data

Then test the data-sending step by clicking Send test data on the Papertrail alert and, in Zapier, clicking OK, I did this.

Once the test has succeeded, set up the action. Select the Google Sheets app, then Create Spreadsheet Row. Select the spreadsheet and worksheet created earlier, and fill in the fields with selections from the payload.

Voila! A spreadsheet that will dynamically update with details from the matched events when the Papertrail alert fires.

If custom alerts going right into your app of choice sound great, give it a try and let us know your thoughts.

Improved Log and Account Access Permissions

Posted by @rpheath on

Starting today, it’s possible to grant a user access to logs from certain senders/groups (within the same Papertrail organization). Additionally, we’ve added specific permissions for managing users, changing plans, and purging logs.

Here’s an example, where an administrator is changed to have read-only access to certain groups:

Group permissions screencast

What’s possible?

Papertrail’s granular access control and permissions allow:

  • Companies to segregate access by responsible team, like granting access to logs from a staging environment or a specific product.
  • Consultants and hosting providers to provide limited access to many customers, while still managing all logs themselves.
  • Admins or accounting teams to handle less-common changes like adding users, changing plans, and purging logs.

These new permissions keep access clean within a single organization and may reduce the need for multiple organizations. Some may still benefit from having multiple organizations, or a combination of both multiple organizations and granular organization-specific permissions.

Give it a try

To change permissions, visit the Members section. And as always, we’d appreciate hearing your ideas.

Event actions: flexible context, fast troubleshooting

Posted by @coryduncan on

Today we’re excited to release new ways to act on specific events, including seeing surrounding and related context, copying a deep link URL, and transitioning to the command-line.

Event actions

With this new feature, event actions, one can:

  • Link to an event in the same context you’re viewing. An easy way to share an event with a teammate or save it for reference.

  • Transition to the Papertrail CLI at the same point the event occurred.

  • Show an event within a different context, like a specific system, program, or group. Switching contexts allows quick examination of multi-line events or multi-system incidents.

Because text selection is often an integral part of log-based troubleshooting, the event actions “+” button doesn’t change text selection behavior. Select text as you normally do, even directly over the button, and it’ll work as normal.

Flexible context

In most situations, showing surrounding context means removing the current search query. It’s a bit like zooming out: show me a specific event in the context of all events from a given log sender, program, sender and program, or group. This makes it possible to transition between any set of related logs without losing the specific message you’re interested in.

When you’re certain that all relevant events match your current search, the context links can show surrounding context while retaining the current search query.

Imagine I’m looking at events matching the search format=html. I want to see a specific event in the context of a different set of events that also contain format=html. In this case, I’m interested in seeing matching events from a specific program (app/web.1):

Event actions context

The next time you need more context around an event, try out event actions and let us know if you have feedback. Enjoy!

Click-to-Search: Teaching an Old Log New Tricks

Posted by @rpheath on

Wouldn’t it be great if you could drill down into your logs just by clicking? Today, we’re excited to release a feature that lets you do exactly that. We call it click-to-search.

Here are a few examples of how this will make life easier:

  • Web access logs: Each line in your access log contains an IP address. To click the address to see other lines containing the same address, enable “IP Addresses”.
  • Custom app logs: Your custom application logs contain User IDs with the format “user_id=1234”. Creating a custom clickable element with the regular expression user_id=\d+ would make the string “user_id=1234” clickable.
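Before saving a custom element, it can help to try the pattern against a sample line locally. The sketch below uses grep for that; the sample log line is invented for illustration, and [0-9]+ stands in for \d+ because POSIX extended regular expressions do not define \d.

```shell
# Invented sample line, for illustration only.
line='2016-01-01T00:00:00Z app/web.1: GET /account user_id=1234 format=html'

# -o prints only the matched text; -E enables extended regular expressions.
match=$(printf '%s\n' "$line" | grep -oE 'user_id=[0-9]+')
echo "$match"
```

The matched text is what would become clickable, so a pattern that matches too much or too little is easy to spot this way.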

Papertrail offers the following clickable elements out of the box:

  • IP addresses (enabled by default)
  • Email addresses
  • GUID / UUID
  • Period-separated words (domains, file names, etc.)

For these items, just click a checkbox to activate the elements that apply to your logs.

Since all logs are different, we’ve made it simple to create your own custom clickable elements.

Click-to-search will improve troubleshooting flow and make log filtering easier. We hope you find it useful, and we always welcome feedback on how to make it better. Enjoy!

Log Destination IP will change December 20

Posted by @papertrailapp on

Summary

Update: The new DNS records are now active.

The DNS records for Papertrail’s first four log destinations (logs.papertrailapp.com, logs2.papertrailapp.com, logs3.papertrailapp.com, and logs4.papertrailapp.com) will change on Tuesday, December 20, 2016. The new IP addresses will be in the CIDR block 169.46.82.160/27.

Important: Papertrail will continue accepting log messages sent to the old IPs.

Does this affect me?

Probably not. Loggers will continue logging to the old IPs until they’re restarted, at which time a new DNS lookup will take place. However, if the network uses IP-based egress filtering, the egress filters will need to include the new addresses by Tuesday. (Very few networks filter outbound traffic in this way.)
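For networks that do filter egress by IP, a /27 block covers 32 addresses, here 169.46.82.160 through 169.46.82.191. The sketch below shows one way to check whether an address falls inside that block using plain shell arithmetic. The function names ip_to_int and in_cidr are illustrative, not part of any Papertrail tooling.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if address $1 falls inside CIDR block $2 (e.g. 169.46.82.160/27).
in_cidr() {
  addr=$(ip_to_int "$1")
  base=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Network mask with the top $bits bits set.
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( addr & mask )) -eq $(( base & mask )) ]
}

in_cidr 169.46.82.170 169.46.82.160/27 && echo "inside"
in_cidr 169.46.82.192 169.46.82.160/27 || echo "outside"
```

A check like this can be useful for confirming that an existing allow list already covers the new range before the change date.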

What if we aren’t using DNS?

For any systems sending logs directly to an IP address, no action is needed. The log destinations will continue listening on the old IPs until further notice.

Questions

Please email support@papertrailapp.com if there’s anything we’ve missed.

Event Viewer Display Options

Posted by @coryduncan on

Papertrail’s event viewer displays the full text of your events including the timestamp, log sender, and program. Sometimes, one log element can get in the way of what you’re looking for.

We’ve added a new way to control how messages appear, as a link labeled “Options”:

Event Viewer Options

Clicking the Options link exposes a menu of display preferences that can truncate messages for skim-ability and show or hide three different log elements.

Truncate Message

Enabling “Truncate Message” will trim log messages to a single line. When messages are truncated, clicking any message will show the full message again.

This option can be beneficial when:

  1. many logs use similar or identical formatting, and the most important details are near the beginning of the message
  2. viewing logs on a small screen or window

Hide Time, System, Program

These options will hide any combination of time, log sender, and program. Once hidden, the label is collapsed to a dot and hovering over that dot shows a tooltip of the hidden label.

In addition to maximizing space, these options can be handy when logs:

  1. have long or low/no-value log senders or programs
  2. have an implied date (or a timestamp already included in the log message)
  3. are using a W3C, or similar, character-offset-based log format

Options Persistence

All display options will persist for the current browser window/tab. As a result, multiple instances of the event viewer can run in different tabs and the options can be tailored for each tab.

Questions

If you have any ideas or questions about this feature, please get in touch.

Navigate Faster With the New Finder

Posted by @rpheath on

Summary: Navigate to any log view from anywhere by clicking the new icon in the upper left corner.

Finder Mag Icon

What changed?

The dashboard has always had a way to quickly find a specific saved search, group, or system:

Old Dashboard Search

But the reality is, you might not be on the dashboard when you want to find something. Well, as of today, you no longer have to be. There’s now a new search box at the top left of every page:

Finder

The same “Hit s and type” shortcut works as it always has on the dashboard, but is now accessible from anywhere within Papertrail.

Customers with only a few searches, groups, and senders don’t even need to type. Just click the icon or press the s shortcut to see all available log views:

Finder Small Accounts

Oh! And there’s a bonus: the new finder brought edit links along with it:

Finder Editing

Saving unnecessary clicks and page loads is important, especially when troubleshooting a problem. Every second counts. We hope this change makes finding what you need, when you need it, much more efficient.

As always, let us know if you have questions or ideas for a better finder.

A Smarter Event Viewer

Posted by @coryduncan on

Event focus

We’re excited to release some subtle but powerful updates to Papertrail’s event viewer that make searching and sharing logs much easier:

  • Searches now stay centered around the time you’re looking at, so step-by-step troubleshooting is faster.
  • Event viewer URLs now link to exact positions, so colleagues always see exactly what you see.

Never lose your spot

When searching through logs, it’s common to start with a broad search and gradually edit the search query to refine the results. Previously, a refined search would start searching again from “now” even if you had scrolled to results from the past.

Now when you edit an existing search query, the results will be based on the time of the events you are currently viewing. This should give you a quicker and more accurate search experience. Of course when you do need to manually set a search time, that option is still available.

You see what I see

When all your logs are in one place, it’s easy to link to and share important events. We’ve made this as simple as copying and pasting the URL of your search. However, in searches that returned a lot of events, the URL might not indicate exactly what events you were looking at when sharing.

Now, you can share an event viewer URL with your team in confidence, knowing that whoever visits that URL will see the same set of events, in the same position, as was shown when the URL was generated.

How does it work?

These changes are automatically enabled in the event viewer. Keep doing what you’re doing! For example, when you perform a search, then scroll to a different position, and then perform a second search, you’ll be at the same time. And if you copy an event viewer URL and return to it, you’ll be looking at exactly the same log message.

What do you think?

These are subtle changes from how our event viewer has worked in the past, but we hope you’ll find they feel quite natural and improve your experience with Papertrail. Try it out and please let us know what you think. Thanks!

April 14: New SHA-2 TLS certificate for log destinations

Posted by @troyd on

On Thursday, April 14, 2016, Papertrail will deploy a new SHA-2 (SHA-256) TLS/SSL certificate for its syslog destinations, replacing the current SHA-1 certificate.

Update 2016-04-25

On April 14, Papertrail deployed a new SHA-2 certificate, discovered that older versions of remote_syslog2 (0.13 and prior) did not accept the certificate, and reverted to the prior SHA-1 certificate.

Part of our job is to be our customers’ eyes and ears for changes like this. To that end, Papertrail will:

  • present the new SHA-2 certificate for about 6 hours on April 27, 2016
  • log all failed connection attempts due to TLS negotiation failures
  • revert to the existing SHA-1 certificate
  • email affected customers

The certificate will be deployed permanently on Wednesday, May 4, 2016.

If we can answer any questions or help, let us know.

Update 2016-04-14

During the migration, we were reminded that versions of remote_syslog2 before v0.14 don’t support SHA-2. Based on this new information, we are reverting the TLS certificate and will post a revised deployment plan next week.

Impact

For nearly all senders and all common log senders (including remote_syslog2 and rsyslog), this will be a non-event. OpenSSL has supported SHA-2 since 0.9.7m and enabled SHA-2 by default beginning with 0.9.8l (released on 2009-11-05).

The only senders which Papertrail knows do not accept SHA-2 (SHA-256) certificates are those running:

Windows 8, 7, Vista, Server 2012, and Server 2008 are not affected.

Why is this necessary?

The Wikipedia entry for SHA-2 explains the reason for this change:

Although (as of 2015) no example of a SHA-1 collision has been published yet, the security margin left by SHA-1 is weaker than intended, and its use is therefore no longer recommended for applications that depend on collision resistance, such as digital signatures.

Papertrail’s Web site already serves a SHA-2 certificate. This change only affects Papertrail’s syslog endpoints.
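To see which kind of certificate a log destination currently presents, one option is to inspect its signature algorithm with the openssl command-line tool. This is a rough sketch: the helper name is_sha2_signature is illustrative, and HOST and PORT in the commented usage are placeholders for your own log destination.

```shell
#!/bin/sh
# Read `openssl x509 -noout -text` output on stdin and succeed if the
# certificate's signature algorithm is SHA-2 based (sha256/sha384/sha512).
is_sha2_signature() {
  grep -m 1 'Signature Algorithm:' | grep -qiE 'sha(256|384|512)'
}

# Usage against a live endpoint (HOST and PORT are placeholders; not run here):
# openssl s_client -connect "$HOST:$PORT" </dev/null 2>/dev/null \
#   | openssl x509 -noout -text | is_sha2_signature && echo "SHA-2 certificate"
```

This only reports what the server presents. Whether a given sender accepts the certificate still depends on that sender's TLS library.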

Questions

If we can help test an old device or otherwise save you time, we’re at your service: support@papertrailapp.com.

Subtle Refinements to Papertrail's Event Viewer

Posted by @rpheath on

We’re pleased to release several improvements to Papertrail’s event viewer. Based on our own experience and how we’ve seen customers use Papertrail, these changes make the viewer easier for new users to explore incrementally, and more predictable once they have.

One place to choose what to see

There are three ways to control which logs the viewer shows: groups (of log senders/systems), search queries, and time.

Until now, one of these options was buried in the upper left corner, hundreds of pixels away from the other two:

Old Group Filter

The idea was that the dropdown list of groups in the upper left corner would frame my view, in the same way that the title of a page might. The problem: the group of log senders is used alongside the search query and the time, which are both at the bottom of the screen. Also, so many sites use the upper left corner for site-wide navigation or decoration that Papertrail’s group dropdown was easy to miss.

Realizing this, we moved all three scope-constraining options to one place. We hope this saves mouse mileage.

Bonus:

  • the existing icon to access/change saved searches will glow blue when an existing saved search is being used
  • edit any group or saved search right from the viewer:

Edit group from viewer

Quickly update existing saved searches

Previously, to overwrite a saved search, one needed to navigate to the saved search’s settings page and change the query in a form field. Usually I want to refine a saved search because I’m viewing the resulting logs, though. It’s almost never a task of its own.

To avoid the back-and-forth, it’s now possible to update searches from the event viewer. After searching for a query which isn’t currently saved, clicking “Save Search” now shows two options:

Replace Existing Search

This new “Replace an existing search” option makes it easy for search queries to evolve and improve as I explore my logs, so saved searches always reflect the team’s current knowledge.

Offer control of high-volume streams

When “tailing” a live stream, sometimes the stream will include more new log messages than would be sane to present at once. I’m not great at evaluating 250 events per second, let alone 2,500 or 25,000, and having them unceremoniously dumped on my screen wouldn’t help me debug a problem.

This is only relevant for high volume live tail streams, so Papertrail showed a subset of the new logs on the live stream and made all logs available for non-live views.

However, this had two gaps: I want an indication that I’m viewing a subset of live logs, and I’d like more control of what to do next – sometimes I spot a problem and do want to see more.

Now, when events are omitted from a high-volume live stream, Papertrail says so. Also, I can click to load omitted events:

Load Omitted Events

Moved Contrast setting to Profile

Papertrail’s viewer supports a dark and a light background. Previously, this was a button in the viewer. We’ve learned that contrast is more of a personal preference: once set, very few people want to change it casually, and neither do we. It wasn’t a good use of space or cognitive load in the viewer. It’s now in Profile:

Profile Contrast Setting

What do you think?

Our design goals are gradual, effortless discoverability the first time something is needed, then minimum cognitive load on every future use. A recent 99% Invisible video about The Norman Door explains this incredibly well. Tiny decisions matter. In some cases, we’ve been testing these changes on ourselves and refining them for 2 months.

Take the updated viewer for a spin and send us your opinions and requests. Enjoy!

Never type the same API token twice

Posted by @coryduncan on

Typing the same alert settings into multiple alerts sucks. Browser autocompletion makes it tolerable, but it’s not ideal. To help with this, now when you create a new alert, you can copy details from one of your existing alerts. This is a quick way to set up multiple alerts that share details.

Clone Alert Details

Introducing Syslog Rate Limits

Posted by @jpablomr on

Summary

Occasionally, a misconfigured log sender will generate an astonishingly high volume of log data. Because UDP doesn’t offer backpressure, a misconfigured UDP sender can generate hundreds of thousands of packets per second with no regard to whether Papertrail accepts or even receives the logs. To any other service, this activity would be a denial-of-service attack. It’s our responsibility to ensure that such a misconfigured (or even malicious) sender can’t cause problems for other Papertrail customers, while also making everyone’s logging service as painless and predictable as we possibly can.

Until now, Papertrail has handled log floods reactively and manually. With a handful of incidents under our belt, we’re now comfortable using proactive rate limits to automatically identify and minimize the impact of these unintentional floods (particularly from UDP senders, where backpressure is not possible).

These syslog rate limit rules will go live on Thursday, February 11th, 2016. As explained below, customers should not see a difference in log delivery reliability from today. Also, in the future, Papertrail will periodically email customers who regularly reach the rate-limits simply to ensure that no one is surprised.

Our job is to make this simple and painless for you, so if we can answer any questions or explain in more detail, we want to know.

Updates

2016-02-11: Rate limits have been enabled.

A quick refresher on UDP

UDP is a great protocol for sending information with minimal overhead. Its simplicity and ‘fire and forget’ model make it a practical alternative to TCP for use cases where losing a few packets is not critical. Nevertheless, there is a catch: when a UDP packet is dropped (due to network or device issues), the sender has no knowledge of it. Since the UDP sender doesn’t know that packets were lost, it can’t moderate its transmission rate. A UDP sender will continue to send as fast as it can, whether or not the recipient receives the data.

This awareness, called “backpressure,” is what lets TCP realize that a link is congested, and slow its sending rate.

When a UDP sender starts sending unusually large amounts of data, the probability that some of the data doesn’t arrive increases. To picture this, imagine taking notes from a speech: if the speaker talks too quickly, you might not understand everything the speaker says or you might forget to write some parts down. The speaker doesn’t notice if you can’t hear what they say or if you fail to write everything down; they will continue speaking until they are done.

If your senders are configured to use UDP, a big spike of UDP messages might cause some of them to be dropped and forgotten along the way.

Why do we need to set rate limits?

Because we cannot apply backpressure to UDP senders, our only mechanism for fairly handling all customers is to apply rate limits for UDP that are above what we have seen as normal volumes for senders.

To avoid interfering with high-volume log senders, we based the limits on the peak log volume of Papertrail’s highest-volume senders (which Papertrail already measures as part of regular operations).

Separately, we have seen rare cases where a misbehaving TCP sender will cause issues by trying to connect tens or hundreds of times per second. With TLS encrypted syslog, the TLS handshake overhead can make this look a lot like a CPU exhaustion DoS attack. In extreme cases, these misbehaving senders could interfere with normal operation of Papertrail’s syslog destinations. There’s no graceful way to ignore 1 million packets per second or 10,000 failed TLS connections per second, so the question isn’t whether limits need to exist, just what they should be and how to implement them thoughtfully.

As we have grown, more misbehaving senders have required us to react to each incident and manually limit them until the senders are behaving properly. This is not an ideal solution, as manual tasks are error-prone and we’d rather spend that time improving the service. The rate limits we are introducing are just an automated form of our manual limiting policies.

Automated rate limits will let Papertrail:

  • Guarantee the normal operation of our service for all users.
  • Quickly determine misbehaving senders and help restore them to normal operation.
  • Spend time upgrading our infrastructure and working on new features.

What are the rate limits?

For UDP:

  • 3,000 messages per second per source IP.
  • 10,000 messages per second to a log destination (port).

For TCP:

  • 10 new connections per second per source IP to a log destination (port).

These limits can (and probably will) change, based on our regular measurements. If these limits present a problem, let us know and we’ll work with you. Again, customers won’t see changes from their current log message delivery reliability; these limits just automate what’s already happening. We’ll also be proactively contacting customers who regularly exceed these guidelines simply so they’re aware.
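To get a rough sense of whether a sender comes anywhere near these numbers, one approach is to count messages per second in a local log file. The sketch below assumes traditional syslog timestamps ("Mon DD HH:MM:SS") at the start of each line; the function name seconds_over_limit and the file path in the comment are illustrative. It only approximates the per-source-IP limit, since it counts whatever happens to be in the file.

```shell
#!/bin/sh
# Read traditional syslog lines ("Mon DD HH:MM:SS host ...") on stdin and
# print each second whose message count exceeds the limit given as $1.
seconds_over_limit() {
  awk -v limit="$1" '
    { count[$1 " " $2 " " $3]++ }
    END {
      for (ts in count)
        if (count[ts] > limit) print ts, count[ts]
    }
  ' | sort
}

# Example: flag any second with more than 3,000 messages in a local file
# (the path is a placeholder):
# seconds_over_limit 3000 < /var/log/syslog
```

Empty output means no single second in the file exceeded the limit.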

What protocol should I use to send my messages?

Use TCP if:

  • You can’t miss a single log message.
  • A single sender IP address may generate more than 3,000 packets per second of logs regularly (and those logs are operationally relevant, not noise).
  • Communication between your senders and Papertrail needs to be encrypted.

Use UDP if:

  • Your syslog sender must be non-blocking.
  • Your syslog sender only supports UDP.
  • The benefits of using UDP outweigh the risks of losing messages.

Papertrail’s limits also try to expose the throughput which we believe each protocol is well-suited for. A single system generating 5,000 messages per second of logs via UDP (let alone 10,000 or 50,000) is likely to experience at least some loss, possibly even at the sender’s NIC buffer. If some loss during periods of high log volume is acceptable, UDP may still be fine.

On the other hand, customers regularly generating more than 3,000 messages per second from a single sender, and who want reliable delivery, will be happier with TCP - even without the rate-limits described here. remote_syslog makes changing protocols very easy, as do most other sending daemons.

Questions

We took a lot of time and care in determining a rate limiting policy which would only affect senders that might adversely affect Papertrail’s service. However, we understand that rate limits always introduce some concerns. If there’s anything we can do to help, like recommending a protocol, reviewing a high-traffic system, or providing a configuration for a different protocol, we want to help. Let us know at support@papertrailapp.com.