As you might know, I discovered Ansible a year ago. Since then, I have not only been using it for its main purpose, as Wikipedia describes it:
Ansible is an open-source software provisioning, configuration management, and application-deployment tool enabling infrastructure as code.
Instead, I have developed a few playbooks for other objectives.
I will present two of them:
- Search in logs
- Count clients connections
Search in Logs
At one customer, an application is deployed over a 12-node cluster. In day-to-day operations, I receive user tickets with an error message, but without knowing which server the user was connected to when facing the problem (there is a load balancer in front of the clients). Unfortunately, centralized log management is not ready yet, so I had to think of another solution. This is where Ansible could help.
The advantage of Ansible over bash scripting, in this situation, is that all user credentials are already managed by the Ansible environment that was developed:
```yaml
- name: Include common tasks
  tags: [ always ]
  include_role:
    name: common
    apply:
      tags: always
```
This environment transparently manages:
- The service user name and its associated credentials.
- Access to the servers with an admin account (login and password).
What Are We Looking For?
Let’s focus on the main feature: the search. To do that, the first thing is to know what we are looking for:
```yaml
- name: Prompt pattern
  block:
    - name: Prompt pattern
      register: pattern_input
      ansible.builtin.pause:
        prompt: |
          Enter searched pattern
    - name: Set pattern fact
      set_fact:
        pattern: "{{ pattern_input.user_input }}"
  when: pattern is not defined
  delegate_to: localhost
  run_once: True
```
What I did is to interactively request the pattern we are looking for if it was not already provided as a playbook parameter (i.e. not defined). If the pattern is already set, the block of code is skipped.
To avoid prompting for the same pattern as many times as there are servers in the cluster, the block runs only once (`run_once: True`), and since it is not related to a specific host of the inventory, I kept it local (`delegate_to: localhost`).
Searching
Now, I am ready to do the actual search:
```yaml
- name: Search {{ pattern }} in log
  find:
    paths: /opt/,/u02/app/weblogic/config/domains/{{ weblogic_domain }}/servers/
    file_type: file
    exclude: '*.gz'
    contains: '.*{{ pattern }}.*'
    recurse: true
    age: -5d
  register: findings
  become: true
  become_user: weblogic
```
I am using the find module with a regex. This regex requires “.*”, meaning any character any number of times, to be added at the beginning and the end. If not, it will match only files whose content is exactly the pattern, nothing more, nothing less. The result is stored (i.e. registered) in the findings variable. Note that I searched only files not older than 5 days (`age: -5d`) and excluded archived logs (`exclude: '*.gz'`) for faster results.
Then, my idea was to provide a list of files containing the pattern:
```yaml
- name: Output the path of the files
  set_fact:
    path: "{{ findings.files | map(attribute='path') | join('\n - ') }}"
```
The path variable is temporary and will be written to a file local to the Ansible controller.
Finally, writing the file:
```yaml
- name: Remove {{ ansible_limit }} file
  ansible.builtin.file:
    path: "{{ ansible_limit }}.out"
    state: absent
  delegate_to: localhost
  run_once: True

- name: Copy list of files in {{ ansible_limit }}
  ansible.builtin.lineinfile:
    path: "{{ ansible_limit }}.out"
    line: "{{ inventory_hostname }}:\n - {{ path }}"
    create: yes
    mode: 0666
  delegate_to: localhost
  throttle: 1
```
In the first task, I remove the file; in the second, I write the results into it. At first, the results came out unordered, since they depended on the completion time of the tasks on each individual node. To avoid that, I added “throttle: 1”, which ensures only one of these tasks runs at a time. “order: sorted” is also set at the beginning of the playbook so that hosts are processed in a deterministic order.
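For reference, here is a minimal sketch of where `order: sorted` sits, at the play level (the play name is hypothetical; the included role is the common one shown earlier):

```yaml
- name: Search pattern in logs   # hypothetical play name
  hosts: all
  order: sorted                  # iterate over hosts in alphabetical order
  tasks:
    - name: Include common tasks
      tags: [ always ]
      include_role:
        name: common
        apply:
          tags: always
```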
Count Clients Connections
This second playbook gets the number of clients connected to each server, to confirm that they are correctly load balanced across all nodes.
The first task is to get the process ID with a “shell” task:
```yaml
- name: Getting process ID
  shell: ps aux | grep '{{ pattern }}' | grep -v grep | tr -s ' ' | cut -d ' ' -f2
  register: ps_output
```
“pattern” is a string which helps to find the PID.
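As a side note, a shorter alternative to the ps/grep/tr/cut chain, assuming `pgrep` is available on the nodes, is `pgrep -f`, which matches against the full command line and excludes its own process from the results:

```yaml
# Alternative sketch: pgrep -f replaces ps aux | grep | tr | cut
- name: Getting process ID
  shell: pgrep -f '{{ pattern }}'
  register: ps_output
```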
Then, I used netstat to find all established connections to that process (with pid_string set to “{{ ps_output.stdout }}/java”):
```yaml
- name: netstat
  shell: netstat -anpt 2>/dev/null | grep '{{ pid_string }}' | grep ESTABLI | grep -v 1521
  register: conn_list
  become: true
  become_user: weblogic
```
I filtered out connections to the Oracle database (port 1521), as this process has connections to it as well.
The number of lines in the “conn_list” variable is the number of connections:
```yaml
- name: Set conn_count
  set_fact:
    conn_count: "{{ conn_list.stdout_lines | length }}"
```
In the same way as in the previous playbook, I create a file local to the Ansible controller, where I write one line per node with its number of connections:
```yaml
- name: Copy result in {{ result_file }}
  ansible.builtin.lineinfile:
    path: "{{ result_file }}"
    line: "{{ inventory_hostname }};{{ pattern }};{{ pid_string }};{{ conn_count }}"
    create: yes
    mode: 0666
  throttle: 1
```
I have included, on each host’s line, the pattern used and the process ID found on that host. Please note that all tasks associated with the local file are grouped together and delegated to localhost.
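A sketch of that grouping, with `delegate_to` set once on the surrounding block instead of on every task (the block name is hypothetical):

```yaml
- name: Handle local result file
  block:
    - name: Remove {{ result_file }}
      ansible.builtin.file:
        path: "{{ result_file }}"
        state: absent
      run_once: True

    - name: Copy result in {{ result_file }}
      ansible.builtin.lineinfile:
        path: "{{ result_file }}"
        line: "{{ inventory_hostname }};{{ pattern }};{{ pid_string }};{{ conn_count }}"
        create: yes
        mode: 0666
      throttle: 1
  delegate_to: localhost   # applies to all tasks in the block
```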
Finally, I thought I could add a total of connections across all nodes. This was the difficult part. Initially, I used sed on the file to do it, but then I thought, “There is nothing that Ansible can’t do!”. So I persevered and found this solution:
```yaml
- name: Calculate totals in {{ result_file }}
  set_fact:
    TotalConnLines: "{{ ansible_play_hosts_all | map('extract', hostvars, 'conn_count') | map('int') | sum }}"
  run_once: True
```
Let’s detail that Jinja template:
- ansible_play_hosts_all
- map('extract', hostvars, 'conn_count')
- map('int')
- sum
Part 1 gets the list of all hosts the play runs on. Then, in part 2, I extract the “conn_count” variable for each of these hosts from “hostvars”. This gives a list of counts. I could simply pipe it to “sum”, but that fails because the elements of the list are strings. So I had to apply “int” to each of them with the help of “map” (part 3). Finally, part 4 sums up the counts.
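To see what each step produces, here is a hypothetical debug task; the values in the comments assume a three-node play with counts 12, 7 and 9:

```yaml
- name: Inspect each step of the filter chain (hypothetical values)
  debug:
    msg:
      - "{{ ansible_play_hosts_all }}"                                                    # ['node1', 'node2', 'node3']
      - "{{ ansible_play_hosts_all | map('extract', hostvars, 'conn_count') | list }}"    # ['12', '7', '9'] (strings!)
      - "{{ ansible_play_hosts_all | map('extract', hostvars, 'conn_count') | map('int') | sum }}"  # 28
  run_once: True
```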
Then, I write the total line to the resulting file:
```yaml
- name: Add total line in {{ result_file }}
  ansible.builtin.lineinfile:
    path: "{{ result_file }}"
    line: ";{{ TotalConnLines }}"
    insertbefore: EOF
  run_once: True
```
It takes quite a complex Jinja template to do it, but we see that nothing is impossible.
And Yours?
And you, what are you using Ansible for that is not its main purpose?