千家信息网

clickhouse批量插入数据及ClickHouse常用命令的示例分析

发表于:2024-11-26 作者:千家信息网编辑
千家信息网最后更新 2024年11月26日,小编给大家分享一下clickhouse批量插入数据及ClickHouse常用命令的示例分析,相信大部分人都还不怎么了解,因此分享这篇文章给大家参考一下,希望大家阅读完这篇文章后大有收获,下面让我们一起
千家信息网最后更新 2024年11月26日clickhouse批量插入数据及ClickHouse常用命令的示例分析

小编给大家分享一下clickhouse批量插入数据及ClickHouse常用命令的示例分析,相信大部分人都还不怎么了解,因此分享这篇文章给大家参考一下,希望大家阅读完这篇文章后大有收获,下面让我们一起去了解一下吧!

一.安装使用

ClickHouse是Yandex提供的一个开源的列式存储数据库管理系统,多用于联机分析(OLAP)场景,可提供海量数据的存储和分析,同时利用其数据压缩和向量化引擎的特性,能提供快速的数据搜索。

Ⅰ).安装

sudo yum install yum-utilssudo rpm --import https://repo.yandex.ru/clickhouse/CLICKHOUSE-KEY.GPGsudo yum-config-manager --add-repo https://repo.yandex.ru/clickhouse/rpm/stable/x86_64sudo yum install clickhouse-server clickhouse-clientsudo /etc/init.d/clickhouse-server startclickhouse-client

Ⅱ).配置

a).clickhouse-server

CLICKHOUSE_USER=username CLICKHOUSE_LOGDIR=${CLICKHOUSE_HOME}/log/clickhoue-serverCLICKHOUSE_LOGDIR_USER=usernameCLICKHOUSE_DATADIR_OLD=${CLICKHOUSE_HOME}/data/oldCLICKHOUSE_DATADIR=${CLICKHOUSE_HOME}/data

b).config.xml

... ...        info    ${CLICKHOUSE_HOME}/log/clickhoue-server/clickhoue-server.log    ${CLICKHOUSE_HOME}/log/clickhoue-server/clickhoue-server-error.log    100M    5       ${CLICKHOUSE_HOME}  ${CLICKHOUSE_HOME}/tmp  ${CLICKHOUSE_HOME}/user_files     ::     Asiz/Shanghai... ...

Ⅲ).启停服务

#### a).启动服务sudo service clickhouse-server start#### b).停止服务sudo service clickhouse-server stop

Ⅳ).客户端访问

clickhouse-client

二.常用命令

Ⅰ).创建表

CREATE TABLE IF NOT EXISTS database.table_name ON cluster cluster_shardNum_replicasNum(    'id' UInt64,    'name' String,    'time' UInt64,    'age' UInt8,    'flag' UInt8)ENGINE = MergeTreePARTITION BY toDate(time/1000)ORDER BY (id,name)SETTINGS index_granularity = 8192

Ⅱ).创建物化视图

CREATE MATERIALIZED VIEW database.view_name ON cluster cluster_shardNum_replicasNumENGINE = AggregatingMergeTreePARTITION BY toYYYYMMDD(time)ORDER BY (id,name)AS SELECT     toStartOfHour(toDateTime(time/1000)) as time,    id,    name,    sumState( if (flag = 1, 1, 0)) AS successCount,    sumState( if (flag = 0, 1, 0)) AS faildCount,    sumState( if ((age < 10), 1, 0)) AS rang1Age,    sumState( if ((age > 10) AND (age < 20), 2, 0)) AS rang2Age,    sumState( if ((age > 20), 3, 0)) AS rang3Age,    maxState(age) AS maxAge,    minState(age) AS minAgeFROM datasource.table_nameGROUP BY time,id,name

Ⅲ).插入数据

a).普通数据插入

INSERT INTO database.table_name(id, name, age, flag) VALUES(1, 'test', 15, 0)

b).Json数据插入

INSERT INTO database.table_name FORMAT JSONEachRow{"id":"1", "name":"test", "age":"11", "flag":"1"}

Ⅳ).查询数据

a).表数据查询

SELECT * FROM database.table_name WHERE id=1

b).物化视图查询

SELECT id, name, sumMerge(successCount), sumMerge(faildCount), sumMerge(rang1Age), sumMerge(rang2Age), maxMerge(maxAge), minMerge(minAge) FROM database.view_name WHERE id=1GROUP BY id, name

Ⅴ).创建NESTED表

CREATE TABLE IF NOT EXISTS database.table_name(  'id' UInt64,  'name' String,  'time' UInt64,  'age' UInt8,  'flag' UInt8nested_table_name Nested (  sequence UInt32,  id UInt64,  name String,  time UInt64,  age UInt8,  flag UInt8  socketAddr String,  socketRemotePort UInt32,  socketLocalPort UInt32,  eventTime UInt64,  exceptionClassName String,  hashCode Int32,  nextSpanId UInt64))ENGINE = MergeTreePARTITION BY toDate (time / 1000)ORDER BY (id, name, time)SETTINGS index_granularity = 8192

Ⅵ).NESTED表数据查询

SELECT table1.*,table1.id FROM nest.table_name AS table1 array JOIN nested_table_name AS table2

Ⅶ).配置字典项

      url                    hostname          9000          default                    dict          url_dict
30 36 id hash_code String url String
url_hash hostname 9000 default dict url_hash
30 36 hash_code String url String

Ⅷ).字典查询

SELECT    id,    dictGet('name', 'name', toUInt64(name)) AS name,    dictGetString('url', 'url', tuple(url)) AS urlFROM table_name

Ⅸ).导入数据

clickhouse-client --query="INSERT INTO database.table_name FORMAT CSVWithNames" < /path/import_filename.csv

Ⅹ).导出数据

clickhouse-client --query="SELECT * FROM database.table_name FORMAT CSV" sed 's/"//g' > /path/export_filename.csv

Ⅺ).查看partition状态

SELECT table, name, partition,active FROM system.parts WHERE database='database_name'

Ⅻ).清理partition

ALTER TABLE database.table_name ON cluster cluster_shardNum_replicasNum detach partition 'partition_id'

XIII).查看列的压缩率

SELECT    database,    table,    name,    formatReadableSize(sum(data_compressed_bytes) AS c) AS comp,    formatReadableSize(sum(data_uncompressed_bytes) AS r) AS raw,    c/r AS comp_ratioFROM system.columnsWHERE database='database_name'    AND table='table_name'GROUP BY name

XIV).查看物化视图的磁盘占用

clickhouse-client --query="SELECT partition,count(*) AS partition_num, formatReadableSize(sum(bytes)) AS disk_size FROM system.columns WHERE database='database_name' " --external --?le=***.sql --name=parts --structure='table String, name String, partition UInt64, engine String' -h hostname

以上是"clickhouse批量插入数据及ClickHouse常用命令的示例分析"这篇文章的所有内容,感谢各位的阅读!相信大家都有了一定的了解,希望分享的内容对大家有所帮助,如果还想学习更多知识,欢迎关注行业资讯频道!

0