Kind: HiveCluster
Group: hive.stackable.tech
Version: v1alpha1

apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
spec object

A Hive cluster stacklet. This resource is managed by the Stackable operator for Apache Hive. Find more information on how to use it and the resources that the operator generates in the operator documentation.


clusterConfig object required

Hive metastore settings that affect all roles and role groups. The settings in the clusterConfig are cluster wide settings that do not need to be configurable at role or role group level.


authentication object

Settings related to user authentication.


kerberos object required

Kerberos configuration.


secretClass string required

Name of the SecretClass providing the keytab for the Hive services.
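
As a rough illustration, enabling Kerberos might look like the fragment below. It assumes the fields nest as spec.clusterConfig.authentication.kerberos.secretClass, following the order on this page; the SecretClass name is a placeholder and is not defined anywhere in this reference.

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  clusterConfig:
    authentication:
      kerberos:
        # Placeholder: name of an existing SecretClass that provides keytabs
        secretClass: kerberos-default
```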

database object required

Database connection specification for the metadata database.


connString string required

A connection string for the database. For example: jdbc:postgresql://hivehdfs-postgresql:5432/hivehdfs

credentialsSecret string required

A reference to a Secret containing the database credentials. The Secret needs to contain the keys username and password.

dbType string: enum required
Enum variants: derby, mysql, postgres, oracle, mssql

The type of database to connect to. Supported values are postgres, mysql, oracle, mssql and derby. This value is used to configure the JDBC driver class.
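
The three database fields work together, so a short sketch may help. The connection string is the example given above; the Secret name and the credential values are placeholders chosen for illustration, and the fragment assumes the fields sit under spec.clusterConfig.database.

```yaml
# Hypothetical Secret holding the database credentials.
# The keys username and password are the ones the description above requires.
apiVersion: v1
kind: Secret
metadata:
  name: hive-db-credentials   # placeholder name
stringData:
  username: hive              # placeholder value
  password: change-me         # placeholder value
---
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  clusterConfig:
    database:
      connString: jdbc:postgresql://hivehdfs-postgresql:5432/hivehdfs
      credentialsSecret: hive-db-credentials   # must match the Secret name
      dbType: postgres                         # selects the JDBC driver class
```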

hdfs object

HDFS connection specification.


configMap string required

Name of the discovery ConfigMap providing information about the HDFS cluster. See also the Stackable Operator for HDFS to learn more about setting up an HDFS cluster.
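
A minimal sketch of an HDFS connection, assuming the field path spec.clusterConfig.hdfs.configMap. The ConfigMap name is a placeholder; in practice it would be the discovery ConfigMap of an existing HDFS cluster.

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  clusterConfig:
    hdfs:
      # Placeholder: discovery ConfigMap of the HDFS cluster to connect to
      configMap: simple-hdfs
```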

listenerClass string: enum
Enum variants: cluster-internal, external-unstable, external-stable

This field controls which type of Service the Operator creates for this HiveCluster:

  • cluster-internal: Use a ClusterIP service

  • external-unstable: Use a NodePort service

  • external-stable: Use a LoadBalancer service

This is a temporary solution with the goal of keeping YAML manifests forward compatible. In the future, this setting will control which ListenerClass is used to expose the service. The ListenerClass names will stay the same, which allows for a non-breaking change.
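
For example, exposing the metastore through a NodePort Service could be sketched as follows, assuming listenerClass sits directly under spec.clusterConfig:

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  clusterConfig:
    # One of: cluster-internal, external-unstable, external-stable
    listenerClass: external-unstable   # results in a NodePort Service
```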

s3 object

S3 connection specification. This can be either inline or a reference to an S3Connection object. Read the S3 concept documentation to learn more.


inline object

S3 connection definition as a resource. Learn more on the S3 concept documentation.


accessStyle string: enum
Enum variants: Path, VirtualHosted

Which access style to use. Defaults to virtual hosted-style, as most data products do. See the AWS documentation for details.

credentials object

If the S3 endpoint uses authentication, you have to specify your S3 credentials. In most cases a SecretClass providing accessKey and secretKey is sufficient.


scope object

listenerVolumes []string

The listener volume scope allows Node and Service scopes to be inferred from the applicable listeners. This must correspond to Volume names in the Pod that mount Listeners.

node boolean

The node scope is resolved to the name of the Kubernetes Node object that the Pod is running on. This will typically be the DNS name of the node.

pod boolean

The pod scope is resolved to the name of the Kubernetes Pod. This allows the secret to differentiate between StatefulSet replicas.

services []string

The service scope allows Pod objects to specify custom scopes. This should typically correspond to Service objects that the Pod participates in.

secretClass string required

Name of the SecretClass containing the S3 credentials (accessKey and secretKey).

host string required

Host of the S3 server without any protocol or port. For example: west1.my-cloud.com.

port integer

Port the S3 server listens on. If not specified the product will determine the port to use.

tls object

Use a TLS connection. If not specified no TLS will be used.


verification object required

The verification method used to verify the certificates of the server and/or the client.


none object

Use TLS but don't verify certificates.

server object

Use TLS and a CA certificate to verify the server.


caCert object required

CA cert to verify the server.


secretClass string

Name of the SecretClass which will provide the CA certificate. Note that a SecretClass does not need to have a key but can also work with just a CA certificate, so if you were provided with a CA cert but don't have access to the key, you can still use this method.

webPki object

Use TLS and the CA certificates trusted by the common web browsers to verify the server. This can be useful when you use, for example, public AWS S3 or other publicly available services.

reference string

No Description Provided.
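
The two ways of specifying an S3 connection can be sketched as follows. This assumes the nesting spec.clusterConfig.s3 with either inline or reference set; the host is the example from the host description, while the port, the SecretClass names and the S3Connection name are placeholders.

```yaml
# Variant 1: inline connection definition (fragment)
spec:
  clusterConfig:
    s3:
      inline:
        host: west1.my-cloud.com        # no protocol, no port
        port: 9000                      # placeholder; may be omitted
        accessStyle: Path               # or VirtualHosted
        credentials:
          secretClass: s3-credentials   # placeholder SecretClass with accessKey/secretKey
        tls:
          verification:
            server:
              caCert:
                secretClass: s3-ca      # placeholder SecretClass providing the CA certificate
---
# Variant 2: reference to an existing S3Connection object (fragment)
spec:
  clusterConfig:
    s3:
      reference: my-s3-connection       # placeholder name
```

Only one of the two variants would be used in a given manifest.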

vectorAggregatorConfigMapName string

Name of the Vector aggregator discovery ConfigMap. It must contain the key ADDRESS with the address of the Vector aggregator. Follow the logging tutorial to learn how to configure log aggregation with Vector.

clusterOperation object

Cluster operation properties that allow stopping the product instance as well as pausing reconciliation.


reconciliationPaused boolean

Flag to stop cluster reconciliation by the operator. This means that all changes in the custom resource spec are ignored until this flag is set to false or removed. The operator will, however, still watch the deployed resources and update the custom resource status field. If applied at the same time as stopped, reconciliationPaused takes precedence over stopped and stops the reconciliation immediately.

stopped boolean

Flag to stop the cluster. This means all deployed resources (e.g. Services, StatefulSets, ConfigMaps) are kept but all deployed Pods (e.g. replicas from a StatefulSet) are scaled to 0 and therefore stopped and removed. If applied at the same time as reconciliationPaused, the latter will pause reconciliation and stopped will have no effect until reconciliationPaused is set to false or removed.
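
A small sketch of how the two flags might be combined, assuming they sit under spec.clusterOperation. Here the cluster is stopped while reconciliation stays active, so the operator scales the Pods down.

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  clusterOperation:
    stopped: true                 # Pods are scaled to 0, other resources are kept
    reconciliationPaused: false   # if true, stopped would have no effect until unset
```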

image object required

Specify which image to use. The easiest way is to configure only the productVersion. You can also configure a custom image registry to pull from, as well as completely custom images.

Consult the Product image selection documentation for details.


custom string

Overwrite the docker image. Specify the full docker image name, e.g. docker.stackable.tech/stackable/superset:1.4.1-stackable2.1.0

productVersion string

Version of the product, e.g. 1.4.1.

pullPolicy string: enum
Enum variants: IfNotPresent, Always, Never

Pull policy used when pulling the image.

pullSecrets []object

Image pull secrets to pull images from a private registry.


name string required

Name of the referent. This field is effectively required, but due to backwards compatibility is allowed to be empty. Instances of this type with an empty value here are almost certainly wrong. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

repo string

Name of the docker repo, e.g. docker.stackable.tech/stackable

stackableVersion string

Stackable version of the product, e.g. 23.4, 23.4.1 or 0.0.0-dev. If not specified, the operator will use its own version, e.g. 23.4.1. When using a nightly operator or a PR version, it will use the nightly 0.0.0-dev image.
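
In the simplest case only productVersion is set. The sketch below adds a pull policy and a pull secret for a private registry; the version number and the secret name are placeholders, and the fragment assumes these fields sit under spec.image.

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  image:
    productVersion: 3.1.3              # placeholder product version
    pullPolicy: IfNotPresent           # one of IfNotPresent, Always, Never
    pullSecrets:
      - name: my-registry-credentials  # placeholder Secret name
```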

metastore object

This struct represents a role - e.g. HDFS datanodes or Trino workers. It has a key-value-map containing all the roleGroups that are part of this role. Additionally, there is a config, which is configurable at the role and roleGroup level. Everything at roleGroup level is merged on top of what is configured on role level. There is also a second form of config, which can only be configured at role level, the roleConfig. You can learn more about this in the Roles and role group concept documentation.


cliOverrides object

No Description Provided.

config object

No Description Provided.


affinity object

These configuration settings control Pod placement.


nodeAffinity object

Same as the spec.affinity.nodeAffinity field on the Pod, see the Kubernetes docs

nodeSelector object

Simple key-value pairs forming a nodeSelector, see the Kubernetes docs

podAffinity object

Same as the spec.affinity.podAffinity field on the Pod, see the Kubernetes docs

podAntiAffinity object

Same as the spec.affinity.podAntiAffinity field on the Pod, see the Kubernetes docs

gracefulShutdownTimeout string

Time period Pods have to gracefully shut down, e.g. 30m, 1h or 2d. Consult the operator documentation for details.
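
As an illustration, Pod placement and the shutdown timeout could be set at role level as shown below. The sketch assumes the path spec.metastore.config and uses a made-up node label.

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  metastore:
    config:
      gracefulShutdownTimeout: 30m
      affinity:
        nodeSelector:
          disktype: ssd   # placeholder node label
```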

logging object

Logging configuration. Learn more in the logging concept documentation.


containers object

Log configuration per container.

enableVectorAgent boolean

Whether or not to deploy a container with the Vector log agent.
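
Log aggregation involves two settings on this page: the Vector agent at role level and the aggregator discovery ConfigMap at cluster level. A minimal sketch, assuming these field paths and a placeholder ConfigMap name:

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  clusterConfig:
    # Placeholder: ConfigMap that contains the key ADDRESS
    vectorAggregatorConfigMapName: vector-aggregator-discovery
  metastore:
    config:
      logging:
        enableVectorAgent: true
```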

resources object

Resource usage is configured here. This includes CPU usage, memory usage and disk storage usage, if this role needs any.


cpu object

No Description Provided.


max string

The maximum amount of CPU cores that can be requested by Pods. Equivalent to the limit for Pod resource configuration. Cores are specified either as a decimal number or in milli units. For example, 1.5 means 1.5 cores, which can also be written as 1500m.

min string

The minimal amount of CPU cores that Pods need to run. Equivalent to the request for Pod resource configuration. Cores are specified either as a decimal number or in milli units. For example, 1.5 means 1.5 cores, which can also be written as 1500m.

memory object

No Description Provided.


limit string

The maximum amount of memory that should be available to the Pod. Specified as a byte Quantity, which means these suffixes are supported: E, P, T, G, M, k. You can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki. For example, the following represent roughly the same value: 128974848, 129e6, 129M, 128974848000m, 123Mi
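
A sketch of CPU and memory settings, assuming the path spec.metastore.config.resources. The values are arbitrary examples. Both CPU fields are strings, so a decimal value is quoted here.

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  metastore:
    config:
      resources:
        cpu:
          min: 500m     # request: half a core
          max: "1.5"    # limit: 1.5 cores, equivalent to 1500m
        memory:
          limit: 2Gi    # power-of-two suffix
```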

runtimeLimits object

Additional options that can be specified.

storage object

No Description Provided.


data object

This field is deprecated. It was never used by Hive and will be removed in a future CRD version. The controller will warn if it is set to a non-zero value.


capacity string

Quantity is a fixed-point representation of a number. It provides convenient marshaling/unmarshaling in JSON and YAML, in addition to String() and AsInt64() accessors.

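The serialization format is:

    <quantity>        ::= <signedNumber><suffix>

    (Note that <suffix> may be empty, from the "" case in <decimalSI>.)

    <digit>           ::= 0 | 1 | ... | 9
    <digits>          ::= <digit> | <digit><digits>
    <number>          ::= <digits> | <digits>.<digits> | <digits>. | .<digits>
    <sign>            ::= "+" | "-"
    <signedNumber>    ::= <number> | <sign><number>
    <suffix>          ::= <binarySI> | <decimalExponent> | <decimalSI>
    <binarySI>        ::= Ki | Mi | Gi | Ti | Pi | Ei

    (International System of units; See: http://physics.nist.gov/cuu/Units/binary.html)

    <decimalSI>       ::= m | "" | k | M | G | T | P | E

    (Note that 1024 = 1Ki but 1000 = 1k; I didn't choose the capitalization.)

    <decimalExponent> ::= "e" <signedNumber> | "E" <signedNumber>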

No matter which of the three exponent forms is used, no quantity may represent a number greater than 2^63-1 in magnitude, nor may it have more than 3 decimal places. Numbers larger or more precise will be capped or rounded up. (E.g.: 0.1m will be rounded up to 1m.) This may be extended in the future if we require larger or smaller quantities.

When a Quantity is parsed from a string, it will remember the type of suffix it had, and will use the same type again when it is serialized.

Before serializing, Quantity will be put in "canonical form". This means that Exponent/suffix will be adjusted up or down (with a corresponding increase or decrease in Mantissa) such that:

  • No precision is lost

  • No fractional digits will be emitted

  • The exponent (or suffix) is as large as possible.

The sign will be omitted unless the number is negative.

Examples:

  • 1.5 will be serialized as "1500m"

  • 1.5Gi will be serialized as "1536Mi"

Note that the quantity will NEVER be internally represented by a floating point number. That is the whole point of this exercise.

Non-canonical values will still parse as long as they are well formed, but will be re-emitted in their canonical form. (So always use canonical form, or don't diff.)

This format is intended to make it difficult to use these numbers without writing some sort of special handling code in the hopes that that will cause implementors to also use a fixed point implementation.

selectors object

A label selector is a label query over a set of resources. The result of matchLabels and matchExpressions are ANDed. An empty label selector matches all objects. A null label selector matches no objects.


matchExpressions []object

matchExpressions is a list of label selector requirements. The requirements are ANDed.


key string required

key is the label key that the selector applies to.

operator string required

operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.

values []string

values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.

matchLabels object

matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed.

storageClass string

No Description Provided.

warehouseDir string

The location of the default database for the Hive warehouse. Maps to the hive.metastore.warehouse.dir setting.

configOverrides object

The configOverrides can be used to configure properties in product config files that are not exposed in the CRD. Read the config overrides documentation and consult the operator specific usage guide documentation for details on the available config files and settings for the specific product.

envOverrides object

envOverrides configure environment variables to be set in the Pods. It is a map from strings to strings - environment variables and the value to set. Read the environment variable overrides documentation for more information and consult the operator specific usage guide to find out about the product specific environment variables that are available.

podOverrides object

In the podOverrides property you can define a PodTemplateSpec to override any property that can be set on a Kubernetes Pod. Read the Pod overrides documentation for more information.
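
The three override mechanisms might be combined as in the sketch below, assuming they sit at role level under spec.metastore. The config file name hive-site.xml is an assumption about the available config files, and the property, environment variable and label are placeholders; consult the operator usage guide for the names that actually apply.

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  metastore:
    configOverrides:
      hive-site.xml:                      # assumed config file name
        some.hive.property: some-value    # placeholder property
    envOverrides:
      MY_ENV_VAR: my-value                # placeholder variable (string to string map)
    podOverrides:                         # a PodTemplateSpec
      metadata:
        labels:
          team: data-platform             # placeholder label
```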

roleConfig object

This is a product-agnostic RoleConfig, which is sufficient for most of the products.


podDisruptionBudget object

This struct is used to configure:

  1. If PodDisruptionBudgets are created by the operator

  2. The allowed number of Pods to be unavailable (maxUnavailable)

Learn more in the allowed Pod disruptions documentation.


enabled boolean

Whether a PodDisruptionBudget should be written out for this role. Disabling this enables you to specify your own - custom - one. Defaults to true.

maxUnavailable integer

The number of Pods that are allowed to be down because of voluntary disruptions. If you don't explicitly set this, the operator will use a sane default based upon knowledge about the individual product.
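
A minimal sketch of a PodDisruptionBudget configuration, assuming the path spec.metastore.roleConfig.podDisruptionBudget. The value of maxUnavailable is an arbitrary example.

```yaml
# Fragment of a HiveCluster manifest (other required fields omitted)
spec:
  metastore:
    roleConfig:
      podDisruptionBudget:
        enabled: true       # set to false to supply a custom PodDisruptionBudget
        maxUnavailable: 1   # omit to let the operator choose a default
```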

roleGroups object required

No Description Provided.

status object

No Description Provided.


conditions []object

No Description Provided.


lastTransitionTime string

Last time the condition transitioned from one status to another.

lastUpdateTime string

The last time this condition was updated.

message string

A human readable message indicating details about the transition.

reason string

The reason for the condition's last transition.

status string: enum required
Enum variants: True, False, Unknown

Status of the condition, one of True, False, Unknown.

type string: enum required
Enum variants: Available, Degraded, Progressing, ReconciliationPaused, Stopped

Type of deployment condition.

discoveryHash string

An opaque value that changes every time a discovery detail does.