northben's blog

How to Make Faster Joins in Splunk


Although it's often possible--and recommended--to avoid the join command, sometimes it is necessary to use join. I was recently exploring the performance impact of the join command and I wanted to share my findings.

Splunk Configuration Management -- my progress so far


Being a responsible software engineer includes the systems engineering process of configuration management. Although backups and access controls are a necessary part of maintaining a secure IT system, relying on these processes for configuration management is inefficient and dangerous.

How to Monitor Splunk Index Growth Over Time

Although you can use the Splunk on Splunk app to monitor Splunk index sizes (and many other things!), you might be interested in monitoring index growth over time as well. I'll show you how to do that.

Just for demonstration purposes, you can run this search to see the kind of data that we will collect. This uses the rest command to collect the current index metadata from the Splunk REST API. As you can see, I renamed a few fields just for aesthetic reasons.

How to Delete Splunk Events When Using a Transforming Command


Recently, I needed to delete some events that matched certain summary conditions. For example, where the event count exceeds a certain threshold:

[Image: example showing event search with stats criteria]

Now, if you try to delete the events by appending | delete, you'll receive an error:

Error in 'delete' command: This command cannot be invoked after the non-streaming command 'stats'

A Truly Open Github?


Github is great and all, but it's still a proprietary organization (remember Sourceforge?). How about an open-source github -- where all of the computation and storage runs on a distributed network? The data for all repos could be stored on a blockchain database. Miners would perform the computation that webservers normally perform today.

Like Bitcoin, your own PC could help run the blockchain, your company could run its own miner servers, or third-parties might run miners as a service for you.


A Wishlist for Operating System Developers...

I really like how the advent of mobile operating systems has allowed operating system designers to re-imagine how to create an operating system user interface. Isn't it great that even novice computer users can use pretty much any mobile operating system, and that common user interface behaviors are automatically intuitive and consistent--such as pinching to zoom, rotating a device, tapping and holding, or swiping? This is a good thing. We should have more revolutionary ideas like this in technology.

How to Utilize Post-Process Searches in Splunk Simple XML and HTML


It took me a while to figure out how to use a Post-Process Search in a Splunk Dashboard, so I thought it would be a good idea to remind my future self how it's done.

This is a Simple XML dashboard. It is essentially the same as the example in my last post. The full source code is attached to this post.

In order to use a Post Process search, only three changes are needed:

Dashboards are for Reporting, not Calculating


Since it is so easy to search for data in Splunk, and then create a dashboard in just a couple of clicks, you might be tempted to do just that -- and release your dashboard into production. For some situations, that's absolutely fine. But as your organization becomes more reliant on Splunk dashboards, this approach can become unwieldy. And if there's anything we want, it's wieldy searches!

Splunk: One Search or Two?


One of the most common scenarios I experience in Splunk is where I need to use data from two different indexes at once—typically in order to build management and reporting dashboards. With my background in developing applications on relational databases, my first attempts at this solution used the "join" command in Splunk. Once I realized that a combination of the "append" and "stats" commands can be a better choice, I started using those more. But today I will show an even better, faster approach!

How to delete duplicate events in Splunk


I use Splunk to report on business objects more so than typical security operations data. For instance, helpdesk tickets rather than firewall logs. I have created various Python scripts to import these business objects from various REST and SQL sources, and I want these import scripts to be idempotent. That is, I want to import helpdesk tickets every day, but no more than once per day, regardless of how many times the import script is called.
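To make the "no more than once per day" idea concrete, here is a minimal Python sketch. It is not necessarily the approach the full post takes (which deals with duplicates on the Splunk side); it only shows one way an import script could guard itself. The names import_once_per_day, state_path, and import_fn are hypothetical, and the state file format is an assumption.

```python
import datetime
import json
import os


def import_once_per_day(state_path, import_fn, today=None):
    """Run import_fn at most once per calendar day; return True if it ran.

    state_path and import_fn are hypothetical names used for illustration.
    """
    today = today or datetime.date.today()
    stamp = today.isoformat()

    # Read the date of the last successful import, if one was recorded.
    last = None
    if os.path.exists(state_path):
        with open(state_path) as f:
            last = json.load(f).get("last_import")

    if last == stamp:
        return False  # already imported today, so do nothing

    import_fn()

    # Record success only after the import finishes, so a failed run is retried.
    with open(state_path, "w") as f:
        json.dump({"last_import": stamp}, f)
    return True
```

Calling this function several times on the same day runs the import only on the first call. A real script would probably also need locking if two copies can run at the same moment.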

