Recently at work, I’ve used Splunk Enterprise extensively to analyze application logs. In Splunk, I’m able to query logs, trigger alerts when specific scenarios occur, and create dashboards that visualize logs for fellow engineers and business analysts. Splunk provides its own query language for interacting with logs on its platform: Splunk Processing Language (SPL), which is used for searching, alerting, and dashboard creation [1].
Although Splunk querying has been part of my work life for years, I never explored it on my own or used it in any personal projects. Recently, I decided to run a local Splunk Enterprise server to experiment with its basic features. This groundwork may lead to my adoption of Splunk for monitoring application logs. In this article, I describe the basics of Splunk and how to set up a local instance on Docker. I also discuss basic SPL queries and dashboards, and show how some queries can be run before you upload any logs of your own.
Splunk Enterprise is software for analyzing application, infrastructure, and security logs. Splunk consists of a web interface, a query language (SPL), and a command line interface [2]. Users forward logs from different sources to Splunk, where they are indexed and made available to query. Alerts, dashboards, and reports, all of which build upon SPL queries, can be created in Splunk. With these tools, Splunk allows individuals and organizations to view and react to the status of logs in real time or near real time [3].
Centralized Log Analysis Tools
A centralized log analysis tool brings together logs from across an organization or department into a single, searchable platform. Splunk is a popular example of a centralized log analysis tool. Centralized logging brings many quality-of-life improvements to engineers; it is considerably easier to search on Splunk for logs using a common query language instead of accessing individual servers where logs are initially generated.
Splunk is often compared to another log analysis platform called Elasticsearch, which is part of the ELK Stack. I have written about the ELK Stack in prior articles, mostly because it was used at my prior company. The biggest difference between the two is that Splunk requires a license while the ELK Stack is open source. Beyond that, I view the choice between the two platforms as a personal preference [4].
Splunk provides official Docker images for Splunk Enterprise on Docker Hub. If Docker is installed and running on your computer, starting a Splunk server is as simple as the following two shell commands.
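As a rough sketch of what those two commands can look like, the following pulls the `splunk/splunk` image and starts a container. The port mapping and password are assumptions chosen to match the `localhost:8080` address and the `admin` / `splunkadmin` credentials used in this article; the container name `splunk` is purely illustrative.

```shell
# Pull the official Splunk Enterprise image from Docker Hub.
docker pull splunk/splunk:latest

# Start Splunk in the background. Splunk Web listens on port 8000 inside the
# container; it is published on host port 8080 here (assumed mapping).
# SPLUNK_PASSWORD sets the password for the built-in "admin" user.
docker run -d --name splunk \
  -p 8080:8000 \
  -e SPLUNK_START_ARGS=--accept-license \
  -e SPLUNK_PASSWORD=splunkadmin \
  splunk/splunk:latest
```

Newer image versions may require additional license-acceptance variables, so it is worth checking the image's documentation if the container exits immediately.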
It may take a minute for the container to be ready. With the Docker container running, navigating to localhost:8080 in a web browser should show the following sign-in screen.
Sign in with the credentials configured in the shell command: admin for the username and splunkadmin for the password. After signing in, you are presented with the homepage for Splunk Enterprise!
Now we are ready to start using Splunk. Before forwarding any logs to Splunk, Splunk’s default indexes can be used to practice writing simple queries. An index in Splunk is a repository of data (i.e. logs), and the act of indexing transforms raw data into searchable events stored in that index [5, 6]. One index present on every Splunk installation is _internal. From the Splunk homepage, clicking on "Search & Reporting" on the left-hand side presents a search screen. From here, entering the following query will retrieve events from the _internal index.
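The simplest form of such a query just names the index to search. A minimal sketch, assuming the default `_internal` index and whatever time range is selected in the search screen:

```spl
index=_internal
```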
On the left-hand side of the screen, a list of fields (under "Selected Fields" and "Interesting Fields") is shown. These fields exist in events and are queryable on the current index. A slightly more complex query might perform statistical analysis on one of these fields. The following query aggregates the events and counts the occurrences of each unique value in the source field.
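A minimal sketch of such an aggregation, assuming the grouping is done on the `source` field of the `_internal` index:

```spl
index=_internal
| stats count by source
```

The first line selects the events to work on, and the second line reduces them to one row per distinct `source` value with its event count.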
In the second line of the query above, stats is an SPL command which performs statistical aggregations [7], count is an aggregation function for counting the number of events, by is a clause to group the aggregation function (similar to a SQL GROUP BY statement), and source is the name of a field to group upon. All together, the command stats count by source means "perform a statistical aggregation, grouped by the source field, where the number of events is counted."
Another basic query technique is to filter events within an index. One way to accomplish filtering is with the where SPL command. For those experienced with SQL, the following query has the same functionality as a WHERE <expression> LIKE <pattern> clause.
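One way to write a filter of this kind is with the `like` function inside `where`, which uses `%` as a wildcard just as SQL does. In this sketch, the `source` field and the `%metrics.log` pattern are illustrative assumptions rather than the exact filter from my own searches:

```spl
index=_internal
| where like(source, "%metrics.log")
```

This keeps only the events whose `source` value ends in `metrics.log`.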
A more complex Splunk query may chain together multiple statements using pipes, similar in functionality to Unix pipes in shell commands. For example, the following SPL command pipes together multiple statements, returning all previously run SPL queries on the Splunk server.
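As a hedged sketch of this kind of chained query, the following searches the `_audit` index, where Splunk records search activity, then shapes and orders the results. The filter terms and the choice of displayed fields are assumptions about a typical setup, so adjust them to what your instance actually records:

```spl
index=_audit action=search info=granted search=*
| table _time, user, search
| sort - _time
```

Each pipe hands the results of the previous step to the next one: the events are filtered first, then reduced to three columns, then sorted with the most recent searches on top.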
Queries are also used to display visualizations, such as the following query which charts Splunk's memory utilization over time.
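A chart of this kind can be built with the `timechart` command. The sketch below assumes that host-wide resource usage is recorded in the `_introspection` index with a `data.mem_used` field; the `span` value and the `avg_mem_used` alias are illustrative choices:

```spl
index=_introspection sourcetype=splunk_resource_usage component=Hostwide
| timechart span=5m avg(data.mem_used) AS avg_mem_used
```

Running it and switching to the "Visualization" tab of the search screen renders the result as a line chart.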
Understanding these queries alone will not make you an expert at Splunk, but they should provide a general sense of how SPL queries are used. Complex queries can be written and experimented with before any custom indexes are created, which is great for beginners wishing to experiment. Splunk documentation provides great information on how to structure queries, along with documentation on each SPL command [8].
To query application logs, a custom index must be created. In a production environment, Splunk forwarders are used to send logs to a Splunk Enterprise server, where they are placed in an index. For demonstration purposes, I'll skip the setup of a forwarder and manually upload log data to Splunk. To do this, start by navigating to the homepage and clicking on "Add Data".
On the following page, scroll down and click "Upload files from my computer".
A multi-step process is required to upload a file manually. For full details, check out the uploading tutorial data article provided by Splunk [9]. In my scenario, I uploaded a sample log file from my saints-xctf-api application (accessible from api.saintsxctf.com). This sample log is available in a saints-xctf-api.log file in my devops-prototypes repository.
While uploading my log file, I also created a corresponding Splunk index. This index, called saints_xctf_api, is queryable just like the default indexes.
Even with a small log file like saints-xctf-api.log, interesting queries can be run, such as the following which calculates the frequency of each HTTP status code returned by the API. Additional queries I ran on this log file are available in a saints_xctf_api_queries.spl file on GitHub.
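A status code breakdown can be sketched as follows. This is not the exact query from saints_xctf_api_queries.spl. It assumes access-log style lines in which the three-digit status follows the HTTP version (for example `"GET /path HTTP/1.1" 200`), and `status_code` is a hypothetical field name introduced by the extraction:

```spl
index=saints_xctf_api
| rex field=_raw "HTTP/\d\.\d\"\s+(?<status_code>\d{3})"
| stats count by status_code
| sort - count
```

If Splunk already extracts a status field for your log format, the `rex` step can be dropped and that field used in `stats` directly.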
One of the great advantages of a centralized logging tool is the ability to create visualizations for data, potentially from multiple different sources and applications. In Splunk, one or more visualizations can be combined together into a dashboard. Dashboards provide a great way to analyze application logs quickly, and help business analysts with limited or no programming background to gain insight into the state of the processes they manage.
Splunk Enterprise has a "Dashboard" tab from which dashboards are created, edited, and viewed. I usually create dashboards using the newer "Dashboard Studio" format, which implements each dashboard with JSON code. Dashboards are mainly edited using no-code tools in the UI, but can also be manually modified by changing their JSON configurations. Usually, highly customized dashboards require a combination of both approaches.
The following dashboard, saved in a splunk_internals.json file, uses two SPL queries to display information about Splunk memory utilization and previously run queries.
Using JSON files, dashboards can be passed from one Splunk environment to another. For example, using the splunk_internals.json file, my dashboard can be easily replicated. To do this, first create a new dashboard on Splunk Enterprise, and then paste the JSON code into the "Source" section of the dashboard (accessed via the button outlined in red).
There is a lot more to show with dashboards, such as how to add custom styling in JSON code. I will try to cover that in a future article, but for newcomers to Splunk, it’s important to begin by leveraging the basic no-code capabilities of Splunk dashboards to meet business needs.
Splunk is a great way to centralize and perform analysis on logs from multiple sources. Creating dashboards to display commonly used queries is a great way to provide engineers and business analysts with insights into data and application state. All the code shown in this article is available on GitHub.