Showing content from https://www.alibabacloud.com/help/en/es/use-cases/data-collection-for-alibaba-cloud-elasticsearch below:

Data collection for Alibaba Cloud Elasticsearch - Elasticsearch

This topic describes the methods that are used to collect and send data from a variety of data sources to an Alibaba Cloud Elasticsearch cluster.

Background information

Elasticsearch is widely used for data search and analytics. Developers and communities use it in a wide range of scenarios, including application search, website search, logging, infrastructure monitoring, application performance monitoring (APM), and security analytics. Solutions for these scenarios are provided free of charge. However, before developers can use these solutions, they must import the required data into Elasticsearch.

Elasticsearch provides a flexible RESTful API to communicate with client applications. You can call this RESTful API to collect, search for, and analyze data. You can also use the API to manage Elasticsearch clusters and indexes on the clusters.
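As a minimal sketch of what calling the RESTful API from a script can look like, the following Python code uses only the standard library to build an HTTP request that indexes one JSON document. The endpoint `http://localhost:9200`, the index name `demo_index`, and the helper name `build_index_request` are assumptions made for illustration and do not come from this topic. Authentication is omitted.

```python
import json
import urllib.request


def build_index_request(endpoint, index, doc_id, document):
    """Build (but do not send) a PUT <index>/_doc/<id> request."""
    url = f"{endpoint.rstrip('/')}/{index}/_doc/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical endpoint and index, for illustration only.
request = build_index_request(
    "http://localhost:9200", "demo_index", 1, {"title": "hello"}
)
print(request.get_method(), request.full_url)

# To send the request to a reachable cluster, uncomment the lines below:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the sketch runnable without a cluster and makes the HTTP method, URL, and body easy to inspect.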

Elastic Beats

Elastic Beats is a set of lightweight data shippers that transfer data to Elasticsearch. Because these shippers incur little runtime overhead, Beats can run and collect data on devices with limited hardware resources, such as IoT devices, edge devices, and embedded devices. If you want to collect data but do not have sufficient resources to run a resource-intensive data shipper, we recommend that you use Beats. Based on the data that Beats collects from all Internet-connected devices, you can quickly identify exceptions, such as system errors and security issues, and take measures to deal with them.

Beats can also be used in systems that have sufficient hardware resources.

You can use Beats to collect various types of data.

For more information about how to use Metricbeat, see Use self-managed Metricbeat to collect system metrics. Use other shippers in a similar way.

Logstash

Logstash is a powerful and flexible tool that reads, processes, and transfers all types of data. It provides more features than Beats but requires more hardware resources, so it cannot be deployed on devices that do not meet its minimum requirements. Some Logstash features are either unavailable in Beats or costly to implement with Beats. For example, it is costly to use Beats to enrich documents by looking up data in external data sources. If Beats cannot meet the requirements of a specific scenario, use Logstash instead.

In most cases, Beats and Logstash work collaboratively. Specifically, use Beats to collect data and Logstash to process data.

Alibaba Cloud Elasticsearch integrates the Logstash service. Alibaba Cloud Logstash is a server-side data processing pipeline. It is compatible with all the capabilities of open source Logstash. Alibaba Cloud Logstash can be used to dynamically collect data from multiple data sources at the same time and transform and store collected data to a specified location. Alibaba Cloud Logstash can be used to process and transform all types of events by using input, filter, and output plug-ins.

Logstash data processing pipelines are used to run tasks. Each pipeline requires at least one input plug-in and one output plug-in. Filter plug-ins are optional.

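The following sample Logstash pipeline polls an RSS feed periodically, renames and copies fields, strips HTML tags, newlines, and tabs from the text, and writes the resulting documents to an Elasticsearch index. To use it, perform the following steps: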

  1. Configure a Logstash pipeline.

    input { 
      rss { 
        url => "/blog/feed" 
        interval => 120 
      } 
    } 
    filter { 
      mutate { 
        rename => [ "message", "blog_html" ] 
        copy => { "blog_html" => "blog_text" } 
        copy => { "published" => "@timestamp" } 
      } 
      mutate { 
        gsub => [  
          "blog_text", "<.*?>", "",
          "blog_text", "[\n\t]", " " 
        ] 
        remove_field => [ "published", "author" ] 
      } 
    } 
    output { 
      stdout { 
        codec => dots 
      } 
      elasticsearch { 
        hosts => [ "https://<your-elasticsearch-url>" ]
        index => "elastic_blog" 
        user => "elastic" 
        password => "<your-elasticsearch-password>" 
      } 
    }

    Set hosts to a value in the format of <Internal endpoint of your Elasticsearch cluster>:9200. Set password to the password that is used to access the Elasticsearch cluster.

  2. In the Kibana console, view the migrated index data.

    POST elastic_blog/_search

    For more information, see Step 3: View synchronization results.
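To make the effect of the filter stage in the sample pipeline easier to see, the following Python sketch approximates the two mutate blocks on a single event represented as a dictionary. It is an illustration only, not how Logstash is implemented. The function name `apply_sample_filter` and the sample event are hypothetical.

```python
import re


def apply_sample_filter(event):
    """Approximate the two mutate blocks of the sample pipeline on one event."""
    event = dict(event)  # work on a copy so the input stays unchanged

    # First mutate block: rename message to blog_html, then copy fields.
    if "message" in event:
        event["blog_html"] = event.pop("message")
    if "blog_html" in event:
        event["blog_text"] = event["blog_html"]
    if "published" in event:
        event["@timestamp"] = event["published"]

    # Second mutate block: gsub removes HTML tags, then replaces
    # newlines and tabs with spaces.
    if isinstance(event.get("blog_text"), str):
        text = re.sub(r"<.*?>", "", event["blog_text"])
        event["blog_text"] = re.sub(r"[\n\t]", " ", text)

    # remove_field drops fields that are no longer needed.
    for field in ("published", "author"):
        event.pop(field, None)
    return event


# Hypothetical event, shaped like the fields that the sample pipeline uses.
sample = {
    "message": "<p>Hello\n<b>world</b></p>",
    "published": "2019-08-15T14:12:12",
    "author": "someone",
}
result = apply_sample_filter(sample)
print(result["blog_text"])  # prints: Hello world
```

The original HTML stays available in `blog_html`, while `blog_text` holds the cleaned text that is more useful for full-text search.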

Clients

You can use Elasticsearch clients to integrate data collection code with your custom application code. These clients are libraries that abstract the low-level details of data collection so that you can focus on the operations that are specific to your application. Elasticsearch provides clients for multiple programming languages, such as Java, JavaScript, Go, .NET, PHP, Perl, Python, and Ruby. For details and sample code for your selected language, see Elasticsearch Clients.

If the programming language of your application is not included in the preceding supported languages, obtain the required information from Community Contributed Clients.
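One low-level detail that client libraries typically hide is the request format of the `_bulk` API, which expects newline-delimited JSON: an action line followed by the document source, with a trailing newline at the end. The following standard-library Python sketch builds such a payload by hand. The function name `build_bulk_body`, the index name `demo_index`, and the sample documents are hypothetical. In an application, the bulk helper of your chosen client would normally do this work for you.

```python
import json


def build_bulk_body(index, documents):
    """Return a newline-delimited JSON body for the _bulk API.

    documents is an iterable of (doc_id, source) pairs.
    """
    lines = []
    for doc_id, source in documents:
        # Action line: index this document under the given ID.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


docs = [("1", {"title": "first"}), ("2", {"title": "second"})]
bulk_body = build_bulk_body("demo_index", docs)
print(bulk_body, end="")
```

The resulting body would be sent with a POST request to the `_bulk` endpoint of the cluster, with the `Content-Type` header set to `application/x-ndjson`.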

Kibana

We recommend that you use the Kibana console to develop and debug Elasticsearch requests. Kibana provides all features of the RESTful API in Elasticsearch and abstracts the technical details of the underlying HTTP requests. For example, you can use Kibana to add raw JSON documents to an Elasticsearch cluster.

PUT my_first_index/_doc/1
{
    "title": "How to Ingest Into Elasticsearch Service",
    "date": "2019-08-15T14:12:12",
    "description": "This is an overview article about the various ways to ingest into Elasticsearch Service"
}

Note: In addition to Kibana, you can use other tools to communicate with Elasticsearch and collect documents by calling the RESTful API. For example, you can use cURL to develop and debug Elasticsearch requests or integrate tailored scripts.
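As a rough sketch of how a Kibana console request maps to cURL, the following Python helper turns a method, path, and optional JSON body into a shell-safe `curl` command line. The helper name `console_to_curl`, the default endpoint `http://localhost:9200`, and the default user `elastic` are assumptions for illustration. Replace them with the endpoint and credentials of your own cluster.

```python
import json
import shlex


def console_to_curl(method, path, body=None,
                    endpoint="http://localhost:9200", user="elastic"):
    """Return a curl command equivalent to a Kibana console request."""
    url = f"{endpoint.rstrip('/')}/{path.lstrip('/')}"
    # -u with only a user name makes curl prompt for the password.
    parts = ["curl", "-X", method, "-u", user, url]
    if body is not None:
        parts += ["-H", "Content-Type: application/json",
                  "-d", json.dumps(body)]
    # Quote each argument so the command can be pasted into a shell.
    return " ".join(shlex.quote(part) for part in parts)


command = console_to_curl(
    "PUT",
    "my_first_index/_doc/1",
    {"title": "How to Ingest Into Elasticsearch Service"},
)
print(command)
```

Passing only a user name to `-u` keeps the password out of the generated command and out of the shell history.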

Summary

Multiple methods are available to collect and send data from a variety of data sources to Elasticsearch. Select the method that best fits your business scenarios, requirements, and operating systems.
