
Introduction to Basic Pig Operations

Published: 2025-01-25  Author: 千家信息网 editor

This article covers the basic operations of Pig. Many people get stuck on these operations in practice, so the sections below walk through the commands you are most likely to need.

What is Pig?

My understanding: Pig plays the role of a shell, and Hadoop plays the role of Linux. For that reason I use Pig to work with files on Hadoop whenever I can.

1. Go to the HADOOP_HOME directory.
2. Run sh bin/hadoop
This prints a short description of the available commands:
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar                  run a jar file
  distcp               copy file or directories recursively
  archive -archiveName NAME <src>* <dest> create a hadoop archive
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
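
As a rough illustration of the usage text above, the sketch below calls a few of the listed subcommands from a shell script. The helper name run_hadoop and the HADOOP_BIN variable are assumptions made for this example and are not part of Hadoop. When no hadoop executable is found, the helper only prints the command it would have run.

```shell
#!/bin/sh
# HADOOP_BIN is a hypothetical variable for this sketch; it defaults to "hadoop" on PATH.
HADOOP_BIN="${HADOOP_BIN:-hadoop}"

# run_hadoop (hypothetical helper): run a hadoop subcommand if the
# executable is available, otherwise print what would have been run.
run_hadoop() {
    if command -v "$HADOOP_BIN" >/dev/null 2>&1; then
        "$HADOOP_BIN" "$@"
    else
        echo "dry run: $HADOOP_BIN $*"
    fi
}

run_hadoop version      # print the version
run_hadoop fs -ls /     # generic filesystem client: list the root directory
run_hadoop fsck /       # filesystem checking utility on the root path
```

Each call corresponds to one entry in the COMMAND list. Everything after the subcommand name is passed through to it unchanged.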

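Common Pig commands

ls / pwd / cd

These work in the Grunt shell much like their Unix shell counterparts.

For example, to check the size of a file:

grunt> fs -du -h -s <filename>
19.4 G <filename>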
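
In the command above, -s asks for a single summary line per path and -h for a human-readable size. As a sketch of the conversion behind output such as 19.4 G, the hypothetical shell function human_size below turns a byte count into a value with a binary prefix (1 K = 1024 bytes). It is an approximation for illustration only and is not Hadoop's own formatting code.

```shell
#!/bin/sh
# human_size (hypothetical helper): convert a byte count to a
# binary-prefixed size, dividing by 1024 until the value fits the unit.
human_size() {
    awk -v b="$1" 'BEGIN {
        split("B K M G T P", unit, " ")
        i = 1
        while (b >= 1024 && i < 6) { b /= 1024; i++ }
        if (i == 1) printf "%d %s\n", b, unit[i]     # plain bytes, no decimals
        else        printf "%.1f %s\n", b, unit[i]   # one decimal place
    }'
}

human_size 20830591386   # prints: 19.4 G
```

Without -h, the same du command reports the raw byte count, which is easier to feed into scripts but harder to read at a glance.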

grunt> help

Commands:
<pig latin statement>; - See the PigLatin manual for details: http://hadoop.apache.org/pig
File system commands:
    fs <fs arguments> - Equivalent to Hadoop dfs command: http://hadoop.apache.org/common/docs/current/hdfs_shell.html
Diagnostic commands:
    describe <alias>[::<alias>] - Show the schema for the alias. Inner aliases can be described as A::B.
    explain [-script <pigscript>] [-out <path>] [-brief] [-dot|-xml] [-param <param_name>=<param_value>]
        [-param_file <file_name>] [<alias>] - Show the execution plan to compute the alias or for entire script.
        -script - Explain the entire script.
        -out - Store the output into directory rather than print to stdout.
        -brief - Don't expand nested plans (presenting a smaller graph for overview).
        -dot - Generate the output in .dot format. Default is text format.
        -xml - Generate the output in .xml format. Default is text format.
        -param, -param_file - See parameter substitution for details.
        alias - Alias to explain.
    dump <alias> - Compute the alias and writes the results to stdout.
Utility Commands:
    exec [-param <param_name>=param_value] [-param_file <file_name>]
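
To make the diagnostic commands in the help output more concrete, here is a minimal sketch that writes a small Pig Latin script using describe, explain and dump, then runs it in local mode if a pig executable is installed. The file names, the sample data and the aliases people and adults are all assumptions invented for this example.

```shell
#!/bin/sh
# Work in a throwaway directory; all file names below are hypothetical.
workdir=$(mktemp -d)
printf 'alice,34\nbob,27\n' > "$workdir/people.csv"

# Write a small Pig Latin script that uses the diagnostic commands.
cat > "$workdir/demo.pig" <<EOF
people = LOAD '$workdir/people.csv' USING PigStorage(',') AS (name:chararray, age:int);
adults = FILTER people BY age > 30;
DESCRIBE adults;  -- show the schema of the alias
EXPLAIN adults;   -- show the execution plan for the alias
DUMP adults;      -- compute the alias and write the result to stdout
EOF

# Run in local mode only if a pig executable is installed.
if command -v pig >/dev/null 2>&1; then
    pig -x local "$workdir/demo.pig"
else
    echo "pig not found; script written to $workdir/demo.pig"
fi
```

The same three statements can be typed one at a time at the grunt> prompt. describe and explain only inspect the alias, while dump actually runs the job.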
