less_retarded_wiki/data_structure.md

# Data Structure
2024-01-14 23:22:09 +01:00
*Not to be confused with [data type](data_type.md).*
Data structure refers to any specific way in which [data](data.md) is organized in computer memory, often together with operations that this organization makes efficient. A specific data structure describes such things as order, relationships (interconnection, hierarchy, ...), helper values ([checksum](checksum.md), [indices](index.md), ...), formats and [types](data_type.md) of parts of the data. [Programming](programming.md) is sometimes seen as consisting mainly of two things: design of [algorithms](algorithm.md) and of the data structures these algorithms work with.
As a programmer dealing with a specific problem you oftentimes have a choice of multiple data structures -- choosing the right one is essential for performance and efficiency of your program. As with everything, each data structure has its advantages and downsides; some are faster, some take less memory etc. For example for a searchable database of text strings we may choose between a [binary tree](binary_tree.md) and a [hash table](hash_table.md); a hash table theoretically offers much faster search (constant time on average versus logarithmic time for a balanced tree), but a binary tree may be more memory efficient and keeps its items ordered, which makes other operations such as range search and sorting efficient (a hash table can do these too, but very inefficiently).
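
The tradeoff can be tried out with a small sketch. It is only an illustration under stated assumptions: Python's built-in `set` plays the role of the hash table, and a sorted list searched with the standard `bisect` module stands in for a balanced binary tree (it is not a tree, it just shares the property of keeping items ordered). The word list and all function names are made up for this example.

```python
import bisect

# Hypothetical sample data for the illustration.
words = ["pear", "apple", "fig", "banana", "cherry", "kiwi"]

# Hash table: Python's built-in set hashes each string.
hashed = set(words)

# Sorted list searched with bisect, a stand-in for a balanced
# binary search tree (same ordering idea, simpler code).
ordered = sorted(words)


def contains_hashed(word):
    # Average-case constant time: hash the word, inspect one slot.
    return word in hashed


def contains_ordered(word):
    # Logarithmic time: binary search over the sorted sequence.
    i = bisect.bisect_left(ordered, word)
    return i < len(ordered) and ordered[i] == word


def range_ordered(low, high):
    # All words w with low <= w < high. The order already exists,
    # so only the two boundaries have to be located.
    start = bisect.bisect_left(ordered, low)
    end = bisect.bisect_left(ordered, high)
    return ordered[start:end]


def range_hashed(low, high):
    # The hash table keeps no order, so every item must be checked
    # and the matches sorted afterwards.
    return sorted(w for w in hashed if low <= w < high)
```

Both structures answer the same questions; what differs is how much work each question costs. Exact lookup favors the hash table, while the range query on the hash table has to touch every stored string.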
**What's the difference between data structure and (a potentially structured/complex) [data type](data_type.md)?** This can be tricky; in some specific cases the terms may even be interchanged without committing an error, but there is an important difference. Data structure is a PHYSICAL ORGANIZATION of data, and though it's often associated with operations and algorithms (e.g. a binary tree comes with a natural search algorithm), the stress is on the layout of data in memory. Data type, on the other hand, is a more abstract term defined by a SET OF ALLOWED VALUES and OPERATIONS on those values, usually without paying much attention to how those values and operations work internally. In practice of course we rarely ignore this and often talk about a data type as being connected to a specific data structure, which may be where the confusion comes from (also `struct` is the name of a data type in some languages, which is potentially confusing as well). For example an ASCII text string is a data type: its set of values is all possible sequences of ASCII symbols and the operations it allows are e.g. concatenation, substring search, substring replacement etc. This specific data type can be internally implemented in different ways, though one of the most natural is a "zero terminated string", i.e. an [array](array.md) of values that always ends with the value zero -- this is A DATA STRUCTURE. Because string, a data type, and zero terminated string (an array of values) are so closely connected, we may sometimes hear a *string* being called both a data type and a data structure. However consider another example: a [dictionary](dictionary.md) -- this is a DATA TYPE, very frequently used e.g. in [Python](python.md), which allows storage of pairs of values (a key and the value associated with it); again dictionary itself is a data type defining only "how it behaves on the outside", but it can be implemented in several ways, for example with [trees](tree.md), [hash tables](hash_table.md) or [arrays](array.md), i.e. different DATA STRUCTURES. Different Python implementations will all offer the same dictionary data type but may use a different underlying data structure for it.
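
To make the dictionary example concrete, here is a minimal sketch of one data type backed by two different data structures. The class names `SortedArrayDict` and `HashTableDict`, the `set`/`get` methods and the bucket count are all invented for this illustration; this is not how any particular Python implementation builds its `dict`.

```python
import bisect


class SortedArrayDict:
    """Dictionary stored as two parallel arrays kept sorted by key."""

    def __init__(self):
        self._keys = []
        self._values = []

    def set(self, key, value):
        # Binary search for the key's position in the sorted array.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value          # overwrite existing pair
        else:
            self._keys.insert(i, key)        # keep both arrays aligned
            self._values.insert(i, value)

    def get(self, key):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        raise KeyError(key)


class HashTableDict:
    """Dictionary stored as a fixed number of buckets (chaining)."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash of the key selects which bucket holds the pair.
        return self._buckets[hash(key) % len(self._buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value              # overwrite existing pair
                return
        bucket.append([key, value])

    def get(self, key):
        for stored_key, value in self._bucket(key):
            if stored_key == key:
                return value
        raise KeyError(key)


def demo(dictionary):
    # Uses only the "outside" behavior, so it works with either class.
    dictionary.set("one", 1)
    dictionary.set("two", 2)
    dictionary.set("one", 11)
    return dictionary.get("one"), dictionary.get("two")
```

`demo(SortedArrayDict())` and `demo(HashTableDict())` return the same pair, because `demo` depends only on the data type (the allowed operations), not on the data structure underneath.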
## Specific Data Structures
These are just some common ones:
- [array](array.md)
- [binary tree](binary_tree.md)
- [bitfield](bitfield.md)
- [blockchain](blockchain.md)
- [B+ tree](bplus_tree.md)
- [circular buffer](circular_bugger.md)
- [directed acyclic graph](dac.md)
- [graph](graph.md)
- [hash table](hash_table.md)
- [heap](heap.md)
- [linked list](linked_list.md)
- [N-ary tree](nary_tree.md)
- Pascal [string](string.md)
- [record](record.md)
- [stack](stack.md)
- zero terminated [string](string.md)
- [tree](tree.md)
- [tuple](tuple.md)
- [queue](queue.md)
- ...
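
Two entries of the list, the Pascal string and the zero terminated string, are different memory layouts for the same text. The following sketch shows both as raw bytes. It assumes ASCII text and the classic Pascal layout with a single length byte in front (so at most 255 characters); the function names are made up for this example.

```python
def to_zero_terminated(text):
    # Layout: the characters, followed by one byte of value zero.
    data = text.encode("ascii")
    if 0 in data:
        raise ValueError("a zero byte cannot appear inside the string")
    return data + b"\x00"


def from_zero_terminated(raw):
    # The length is not stored; scan until the terminating zero.
    end = raw.index(0)
    return raw[:end].decode("ascii")


def to_pascal(text):
    # Layout: one byte holding the length, followed by the characters.
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("length does not fit into one byte")
    return bytes([len(data)]) + data


def from_pascal(raw):
    # The length is read directly from the first byte.
    length = raw[0]
    return raw[1:1 + length].decode("ascii")
```

The same two-character text `"hi"` becomes `b"hi\x00"` in the first layout and `b"\x02hi"` in the second. The zero terminated form has no length limit but must be scanned to find its length; the length-prefixed form knows its length immediately but caps it at whatever the prefix can hold.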
## See Also
- [data](data.md)
- [data type](data_type.md)