{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "# Data\n", "\n", "The `geoh5` format allows storing data (values) on different parts of an ``Object``. The data types currently supported by `geoh5py` are\n", "\n", "- Float\n", "- Integer\n", "- Text\n", "- Colormap\n", "- Well log\n", "\n", "![data](./images/data.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "1", "metadata": {}, "outputs": [], "source": [ "from geoh5py.workspace import Workspace\n", "import numpy as np\n", "\n", "# Re-use the previous workspace\n", "workspace = Workspace(\"my_project.geoh5\")\n", "\n", "# Get the curve from previous section\n", "curve = workspace.get_entity(\"Curve\")[0]" ] }, { "cell_type": "markdown", "id": "2", "metadata": {}, "source": [ "## Float\n", "\n", "Numerical `float` data can be attached to the various elements making up object. Data can be added to an `Object` entity using the `add_data` method." ] }, { "cell_type": "code", "execution_count": null, "id": "3", "metadata": {}, "outputs": [], "source": [ "curve.add_data({\n", " \"my_cell_values\": {\n", " \"association\":\"CELL\", \n", " \"values\": np.random.randn(curve.n_cells)\n", " }\n", "})" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "The `association` can be one of:\n", "\n", "- OBJECT: Single element characterizing the parent object\n", "- VERTEX: Array of values associated with the parent object vertices\n", "- CELL: Array of values associated with the parent object cells \n", "\n", "The length and order of the array of values must be consistent with the corresponding element of `association`. If the `association` argument is omited, `geoh5py` will attempt to assign the data to the correct part based on the shape of the data values, either `object.n_values` or `object.n_cells`" ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "# Add multiple data vectors on a single call\n", "data = {}\n", "for ii in range(8):\n", " data[f\"Period:{ii}\"] = {\n", " \"association\":\"VERTEX\", \n", " \"values\": (ii+1) * np.cos(ii*curve.vertices[:, 0]*np.pi/curve.vertices[:, 0].max()/4.)\n", " }\n", "\n", "data_list = curve.add_data(data)\n", "print([obj.name for obj in data_list])" ] }, { "cell_type": "markdown", "id": "6", "metadata": {}, "source": [ "The newly created data is directly added to the project's `geoh5` file and available for visualization:\n", "\n", "![adddata](./images/adddata.png)" ] }, { "cell_type": "markdown", "id": "7", "metadata": {}, "source": [ "## Integer\n", "\n", "Same implementation as for [Float](#Float) data type but with values provided as integer (`int32`)." ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "## Text\n", "\n", "Text (string) data can only be associated to the object itself." ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [], "source": [ "curve.add_data({\n", " \"my_comment\": {\n", " \"association\":\"OBJECT\", \n", " \"values\": \"hello_world\"\n", " }\n", "})" ] }, { "cell_type": "markdown", "id": "10", "metadata": {}, "source": [ "## Colormap\n", "\n", "The colormap data type can be used to store or customize the color palette used by Geoscience ANALYST." ] }, { "cell_type": "code", "execution_count": null, "id": "11", "metadata": {}, "outputs": [], "source": [ "from geoh5py.data.color_map import ColorMap\n", "\n", "# Create some data on a grid2D entity.\n", "grid = workspace.get_entity(\"Grid2D\")[0]\n", "\n", "# Add data\n", "radius = grid.add_data({\n", " \"radial\": {\"values\": np.linalg.norm(grid.centroids, axis=1)}\n", "})" ] }, { "cell_type": "markdown", "id": "12", "metadata": {}, "source": [ "![colormap](./images/default_colormap.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "13", "metadata": {}, "outputs": [], "source": [ "# Create a simple colormap that spans the data range\n", "nc = 10\n", "rgba = np.vstack([\n", " np.linspace(radius.values.min(), radius.values.max(), nc), # Values\n", " np.linspace(0, 255, nc), # Red\n", " np.linspace(255, 0, nc), # Green\n", " np.linspace(125, 15, nc), # Blue,\n", " np.ones(nc) * 255, # Alpha,\n", "]).T " ] }, { "cell_type": "markdown", "id": "14", "metadata": {}, "source": [ "We now have an array that contains a range of integer values for red, green, blue and alpha (RGBA) over the span of the data values. This array can be used to implicitly create a [ColorMap](../api/geoh5py.data.rst#module-geoh5py.data.color_map) from the `EntityType`." ] }, { "cell_type": "code", "execution_count": null, "id": "15", "metadata": {}, "outputs": [], "source": [ "# Assign the colormap to the data type\n", "radius.entity_type.color_map = rgba" ] }, { "cell_type": "markdown", "id": "16", "metadata": {}, "source": [ "The resulting `ColorMap` stores the values to `geoh5` as a `numpy.recarray` with fields for `Value`, `Red`, `Green`, `Blue` and `Alpha`." ] }, { "cell_type": "code", "execution_count": null, "id": "17", "metadata": {}, "outputs": [], "source": [ "radius.entity_type.color_map._values" ] }, { "cell_type": "markdown", "id": "18", "metadata": {}, "source": [ "![colormap](./images/custom_colormap.png)" ] }, { "cell_type": "markdown", "id": "19", "metadata": {}, "source": [ "## Files\n", "\n", "Raw files can be added to groups and objects and stored as blob (bytes) data in `geoh5`." ] }, { "cell_type": "code", "execution_count": null, "id": "20", "metadata": {}, "outputs": [], "source": [ "file_data = grid.add_file(\"./c_data.ipynb\")" ] }, { "cell_type": "markdown", "id": "21", "metadata": {}, "source": [ "![filename](./images/filename_data.png)\n", "\n", "The information can easily be re-exported out to disk with the `save` method." ] }, { "cell_type": "code", "execution_count": null, "id": "22", "metadata": {}, "outputs": [], "source": [ "file_data.save_file(path=\"./temp\", name=\"new_name.ipynb\")" ] }, { "cell_type": "code", "execution_count": null, "id": "23", "metadata": { "nbsphinx": "hidden" }, "outputs": [], "source": [ "import shutil\n", "shutil.rmtree(\"./temp\")" ] }, { "cell_type": "markdown", "id": "24", "metadata": {}, "source": [ "## Well Data\n", "\n", "In the case of `Drillhole` objects, data are always stored as `from-to` interval values. \n", "\n", "### Depth Data\n", "\n", "Depth data are used to represent measurements recorded at discrete depths along the well path. A `depth` attribute is required on creation. Depth markers are converted internally to `from-to` intervals by adding a small depth values defined by the `collocation_distance`. If the `Drillhole` object already holds depth data at the same location, `geoh5py` will group the datasets under the same `PropertyGroup`. " ] }, { "cell_type": "code", "execution_count": null, "id": "25", "metadata": {}, "outputs": [], "source": [ "well = workspace.get_entity(\"Drillhole\")[0]\n", "depths_A = np.arange(0, 50.) # First list of depth\n", "\n", "# Second list slightly offsetted on the first few depths\n", "depths_B = np.arange(0.01, 50.01) \n", "\n", "# Add both set of log data with 0.5 m tolerance\n", "well.add_data({\n", " \"my_log_values\": {\n", " \"depth\": depths_A,\n", " \"values\": np.random.randn(depths_A.shape[0]),\n", " },\n", " \"log_wt_tolerance\": {\n", " \"depth\": depths_B,\n", " \"values\": np.random.randn(depths_B.shape[0]),\n", " }\n", "})" ] }, { "cell_type": "markdown", "id": "26", "metadata": {}, "source": [ "![DHlog](./images/DHlog.png){width=\"50%\"}" ] }, { "cell_type": "markdown", "id": "27", "metadata": {}, "source": [ "### Interval (From-To) Data\n", "\n", "Interval data are defined by constant values bounded by a start (FROM) and an end (TO) depth. A `from-to` attribute defined as a `numpy.ndarray (nD, 2)` is expected on creation. Subsequent data are appended to the same interval `PropertyGroup` if the `from-to` values match within the collocation distance parameter. Users can control the tolerance for matching intervals by supplying a `collocation_distance` argument in meters, or by setting the default on the drillhole entity (`default_collocation_distance = 1e-2` meters)." ] }, { "cell_type": "code", "execution_count": null, "id": "28", "metadata": {}, "outputs": [], "source": [ "# Define a from-to array \n", "from_to = np.vstack([\n", " [0.25, 25.5],\n", " [30.1, 55.5],\n", " [56.5, 80.2]\n", "])\n", "\n", "# Add some reference data\n", "well.add_data({\n", " \"interval_values\": {\n", " \"values\": np.asarray([1, 2, 3]), \n", " \"from-to\": from_to,\n", " \"value_map\": {\n", " 1: \"Unit_A\",\n", " 2: \"Unit_B\",\n", " 3: \"Unit_C\"\n", " },\n", " \"type\": \"referenced\",\n", " }\n", "})\n", "\n", "# Add float data on the same intervals\n", "well.add_data({\n", " \"random_values\": {\n", " \"values\": np.random.randn(from_to.shape[0]), \n", " \"from-to\": from_to,\n", " }\n", "})" ] }, { "cell_type": "markdown", "id": "29", "metadata": {}, "source": [ "![DHinterval](./images/DHinterval.png){width=\"50%\"}" ] }, { "cell_type": "markdown", "id": "30", "metadata": {}, "source": [ "## Get data\n", "Just like any `Entity`, data can be retrieved from the `Workspace` using the `get_entity` method. For convenience, `Objects` also have a `get_data_list` and `get_data` method that focusses only on their respective children `Data`." ] }, { "cell_type": "code", "execution_count": null, "id": "31", "metadata": {}, "outputs": [], "source": [ "my_list = curve.get_data_list()\n", "print(my_list, curve.get_data(my_list[0]))" ] }, { "cell_type": "markdown", "id": "32", "metadata": {}, "source": [ "# Property Groups\n", "\n", "`Data` entities sharing the same parent `Object` and `association` can be linked within a `property_groups` and made available through profiling. This can be used to group data that would normally be stored as 2D array." ] }, { "cell_type": "code", "execution_count": null, "id": "33", "metadata": {}, "outputs": [], "source": [ "# Add another VERTEX data and create a group with previous\n", "curve.add_data_to_group([obj.uid for obj in data_list], \"my_trig_group\")\n" ] }, { "cell_type": "markdown", "id": "34", "metadata": {}, "source": [ "![propgroups](./images/propgroups.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "35", "metadata": {}, "outputs": [], "source": [ "workspace.close()" ] } ], "metadata": { "celltoolbar": "Edit Metadata", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" } }, "nbformat": 4, "nbformat_minor": 5 }