Array programming

In computer science, array programming languages (also known as vector or multidimensional languages) generalize operations on scalars to apply transparently to vectors, matrices, and higher dimensional arrays.

Array programming primitives concisely express broad ideas about data manipulation. The level of conciseness can be dramatic in certain cases: it is not uncommon to find array programming language one-liners that would require more than a couple of pages of Java code to express. [1]

APL, designed by Ken Iverson, was the first programming language to provide array programming capabilities. The name APL is an acronym of the title of his seminal book "A Programming Language"; it does not refer to arrays per se. Iverson's contribution to rigor and clarity was probably more important than the simple extension of functions to more dimensions.

Concepts

The fundamental idea behind array programming is that operations apply at once to an entire set of values. This makes it a high-level programming model, as it allows the programmer to think about and operate on whole aggregates of data without having to resort to explicit loops of individual scalar operations.
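As a rough illustration of this idea, the Python sketch below lifts an ordinary scalar operation so that it also applies to whole nested lists. The helper name lift and the nested-list representation are assumptions made for this example only; real array languages provide this behavior natively and far more efficiently.

```python
import operator

def lift(op):
    """Return a version of the scalar function `op` that also accepts nested lists."""
    def lifted(a, b):
        a_is_list = isinstance(a, list)
        b_is_list = isinstance(b, list)
        if a_is_list and b_is_list:
            if len(a) != len(b):
                raise ValueError("shape mismatch")
            # Two arrays: combine corresponding elements.
            return [lifted(x, y) for x, y in zip(a, b)]
        if a_is_list:
            # Array and scalar: apply the scalar to every element.
            return [lifted(x, b) for x in a]
        if b_is_list:
            return [lifted(a, y) for y in b]
        # Two scalars: fall back to the original operation.
        return op(a, b)
    return lifted

add = lift(operator.add)

scalar_sum = add(2, 3)
vector_sum = add([1, 2, 3], [10, 20, 30])
matrix_sum = add([[1, 2], [3, 4]], [[10, 20], [30, 40]])
```

The same call add(a, b) works for scalars, vectors, and matrices, and the caller never writes a loop.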

The basis of array programming and array-oriented thinking is to find and exploit the properties of data where individual elements are similar and/or adjacent. Unlike object orientation, which implicitly breaks data down into its constituent parts (or scalar quantities), array orientation looks to group data and handle it uniformly.

Function rank is an important concept in array programming languages in general, by analogy to tensor rank in mathematics: functions that operate on data may be classified by the number of dimensions they act on. Ordinary multiplication, for example, is a scalar-rank function because it operates on zero-dimensional data (individual numbers). The cross product operation is an example of a vector-rank function because it operates on vectors, not scalars. Matrix multiplication is an example of a rank-2 function, because it operates on 2-dimensional objects (matrices). Collapse operators reduce the dimensionality of an input data array by one or more dimensions. For example, summing over elements collapses the input array by 1 dimension.
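The collapse described above can be sketched in Python on nested lists. The function names rank and collapse_sum are hypothetical, and the sketch assumes non-empty, rectangular arrays; it is meant only to show how summing removes one dimension at a time.

```python
def rank(a):
    """Number of dimensions of a rectangular nested list; a scalar has rank 0."""
    r = 0
    while isinstance(a, list):
        r += 1
        a = a[0]
    return r

def collapse_sum(a):
    """Sum over the leading dimension, reducing the rank by one."""
    if rank(a) == 1:
        return sum(a)
    # Group corresponding sub-arrays across the leading dimension
    # and sum each group.
    return [collapse_sum(list(group)) for group in zip(*a)]

matrix = [[1, 2, 3], [4, 5, 6]]   # rank 2
vector = collapse_sum(matrix)     # rank 1
total = collapse_sum(vector)      # rank 0
```

Each application of collapse_sum lowers the rank by exactly one: the matrix becomes a vector, and the vector becomes a single number.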

Uses

Array programming is very well suited to implicit parallelization, a topic of much current research. Further, Intel and compatible CPUs developed and produced after 1997 contain various instruction set extensions, starting with MMX and continuing through SSSE3 and 3DNow!, which include rudimentary SIMD array capabilities. Array processing is distinct from parallel processing in that one physical processor performs operations on a group of items simultaneously, while parallel processing aims to split a larger problem into smaller ones (MIMD) to be solved piecemeal by numerous processors. Processors with two or more cores are increasingly common today.

Languages

The canonical examples of array programming languages are APL, J, and Fortran 90. Others include: A+, Analytica, IDL, K, Q, Mathematica, MATLAB, MOLSF, NumPy, GNU Octave, PDL, R, S-Lang, SAC, Nial and ZPL.

Category:Array programming languages provides an exhaustive list.

Examples

In scalar languages such as FORTRAN 77, C, and Pascal, operations apply only to single values, so a+b expresses the addition of two numbers. In such languages, adding two arrays requires indexing and looping:

FORTRAN 77

      DO 10 I = 1, N
          DO 10 J = 1, N
   10         A(J,I) = A(J,I) + B(J,I)

C

for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
        a[i][j] += b[i][j];

Pascal

for i:=1 to n do
    for j:=1 to n do
        a[i,j] := a[i,j] + b[i,j];

This need to loop and index in order to perform operations on arrays is both tedious and error-prone.

In array languages, operations are generalized to apply to both scalars and arrays. Thus, a+b expresses the sum of two scalars if a and b are scalars, or the sum of two arrays if they are arrays. When applied to arrays, the operations act on corresponding elements as illustrated in the loops above. Indeed, when the array language compiler/interpreter encounters a statement like:

Ada [2]

 with Ada.Numerics.Generic_Real_Arrays;
 
 A := A + B;

where A and B are two-dimensional arrays, it generates code that is effectively the same as the C loops shown above. An array language therefore simplifies programming, but possibly at a cost known as the abstraction penalty [3][4][5]. Because each addition is performed in isolation from the rest of the code, the result may not be the most efficient code possible (for example, if further additions involving elements of the same array occur later in the same execution, those elements are looked up again unnecessarily). Even the most sophisticated optimizing compiler would have an extremely hard time merging two or more apparently disparate operations that appear in different program sections or subroutines, whereas a programmer would readily see the opportunity and arrange for the sums to be aggregated in the same pass over the array to minimize overhead.
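The cost described here can be made concrete with a small Python sketch. It contrasts two whole-array additions, each performed in isolation with its own pass over the data, with a hand-fused loop that visits every element once. The function names are hypothetical, and the sketch only shows the structure of the work; it makes no claim about how any particular array language compiler behaves.

```python
def add_arrays(x, y):
    """One whole-array addition: a full pass over the data and a new result list."""
    return [xi + yi for xi, yi in zip(x, y)]

def sum_in_two_passes(a, b, c):
    # Each addition is performed in isolation: two passes and one temporary array.
    temp = add_arrays(a, b)
    return add_arrays(temp, c)

def sum_in_one_pass(a, b, c):
    # Hand-fused version: every element is visited once, with no temporary array.
    return [ai + bi + ci for ai, bi, ci in zip(a, b, c)]

a, b, c = [1, 2, 3], [10, 20, 30], [100, 200, 300]
two_pass = sum_in_two_passes(a, b, c)
one_pass = sum_in_one_pass(a, b, c)
```

Both functions return the same values; the difference lies only in how many times the data is traversed and whether an intermediate array is allocated.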

As another example, the following generates the inner product of two arrays (matrix multiplication) in the array language Nial:

a inner[+,*] b

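Notice how the two operations are sequenced on the arrays: corresponding elements are first multiplied (*), and the resulting products are then summed (+).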
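A generalized inner product of this kind can be sketched in Python as a function that takes the two operations as parameters. The function below is loosely modeled on the expression above and is not a description of how Nial implements it; the name inner, its argument order, and the restriction to non-empty two-dimensional nested lists are assumptions made for the example.

```python
import operator
from functools import reduce

def inner(reduce_op, pair_op, a, b):
    """Generalized inner product of matrices a (n x k) and b (k x m)."""
    columns = list(zip(*b))  # transpose b so that its columns can be iterated
    return [
        [
            # Combine a row and a column pairwise, then reduce the results.
            reduce(reduce_op, (pair_op(x, y) for x, y in zip(row, col)))
            for col in columns
        ]
        for row in a
    ]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# With + and * this is ordinary matrix multiplication.
product = inner(operator.add, operator.mul, a, b)

# Other pairs of operations reuse the same structure, e.g. min and +.
min_plus = inner(min, operator.add, a, b)
```

Passing the operations as arguments is what makes the construct general: the traversal of rows and columns is written once, and only the pair of scalar functions changes.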
