Monday, July 5, 2010

Advanced Linux Programming Simplified

 Hi, I have compiled the ALP (Advanced Linux Programming) book into the following sessions. These sessions should be enough to get started with debugging desktop Linux and embedded Linux applications.
 They start with source code editing and debugging, then cover the basics of shared libraries, the use of dlopen and dlclose, GNU Make, Emacs, etc. The material is very good for beginners.

                    SUMMARY OF ALP CHAPTER-01


1.About Emacs

    Emacs is much more than an editor. It is an incredibly powerful program, so much so that at
CodeSourcery, it is affectionately known as the One True Program, or just the OTP for short. You can read
and send email from within Emacs.

C-- Stands for the Ctrl key.
M-- Stands for the Meta key (usually Alt; pressing Esc first also works).

BASICS:
1.C-x + C-f ====> Open a file in Emacs.
2.C-x + C-s ====> Save the file being edited.
3.C-x + C-c ====> Exit Emacs.
4.C-p / C-n ====> Move to the previous / next line.
5.C-b / C-f ====> Move backward / forward one character.
6.C-x + u   ====> Undo.
7.C-v       ====> Move to the next screen (like Page Down).
8.M-b       ====> Move one word backward.
9.M-f       ====> Move one word forward.
10.C-d      ====> Delete the next character.
11.M-d      ====> Delete the next word.
12.C-k      ====> Delete to the end of the line.

For Auto Formatting.

############################################################################################################
2.Compiling with GCC and
3. Automating the Process with GNU Make
    Go thru "reciprocal example" ALP_Chapter1 folder
It describes how to write a simple makefile and how to link....
############################################################################################################

4.Running GDB

   Please go thru GDB reference manual.


############################################################################################################

5. a.Man Pages
(1) User commands, e.g., man sleep
(2) System calls, e.g., man 2 write
(3) Standard library functions, e.g., man 3 printf
(8) System/administrative commands
  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
5.b Info

    The Info documentation system contains more detailed documentation for many core
components of the GNU/Linux system, plus several other programs. Info pages are
hypertext documents, similar to Web pages. To launch the text-based Info browser, just
type info in a shell window. You'll be presented with a menu of Info documents
installed on your system. (Press Control+H to display the keys for navigating an Info
document.)
    Among the most useful Info documents are these:
1.gcc-The gcc compiler
2.libc-The GNU C library, including many system calls
3.gdb-The GNU debugger
4.emacs-The Emacs text editor
5.info-The Info system itself
    Almost all the standard Linux programming tools (including ld, the linker; as, the
assembler; and gprof, the profiler) come with useful Info pages. You can jump directly
to a particular Info document by specifying the page name on the command line:

example:
info libc

############################################################################################################

                        ALP-Chapter-02-Summary


2.1 INTERACTION WITH THE EXECUTION ENVIRONMENT

1.main function ====> Entry point for an Executable.
2.argc and argv ====> receive command line inputs to program.
3.stdout and stdin ====> similar to cout and cin streams in C++ that
provides console input and output.

----------------------------------------------------------------------------------------------------


2.1.1 The Argument List
    Users supply additional information to a program by typing one or
more words after the program name, separated by spaces. These are called command-line
arguments.
    When a program is invoked from the shell, the argument list contains the entire
command line, including the name of the program and any command-line arguments
that may have been provided. Suppose, for example, that you invoke the ls command
in your shell to display the contents of the root directory and corresponding file sizes
with this command line:

% ls -s /

     The main function of a user program can access the argument list via the argc and
argv parameters to main.
The first parameter, argc, is an integer that is set to the number of items in the argument list.
The second parameter, argv, is an array of character pointers. The size of the array is
argc, and the array elements point to the elements of the argument list, as null-terminated
character strings.

go thru arglist.c

---------------------------------------------------------------------------------------------------

2.1.2 GNU/Linux Command-Line Conventions

      The arguments that Unix commands expect fall into two categories:
i.options (or flags) - Options modify how the program behaves.
ii.other arguments - other arguments provide inputs (for instance, the names of input files).

Options come in two forms:

1.Short options consist of a single hyphen and a single character (usually a lowercase
or uppercase letter). Short options are quicker to type.

2. Long options consist of two hyphens, followed by a name made of lowercase and
uppercase letters and hyphens. Long options are easier to remember and easier
to read (in shell scripts, for instance).

    Example :
            ls -l    --> short option: long listing of the files inside a directory
            ls --all --> long option: same as the short option -a (also lists hidden files)

--------------------------------------------------------------------------------------------------

2.1.3 Using getopt_long

      To use getopt_long, you must provide two data structures.

1. A character string containing the valid short options, each a single letter. An option that requires
an argument is followed by a colon. For your program, the string "ho:v" indicates that
the valid options are -h, -o, and -v, with the second of these options followed by an
argument.

2. An array of struct option elements that specifies the available long options.
Each element corresponds to one long option and has four fields.

 In normal circumstances, the first field is the name of the long option (as a character string, without
the two hyphens); the second is 1 if the option takes an argument, or 0 otherwise;
the third is NULL; and the fourth is a character constant specifying the short option
synonym for that long option. The last element of the array should be all zeros.
You could construct the array like this:

#include <getopt.h>

/* The fourth field is a character constant ('h'), not a string ("h"). */
const struct option long_options[] = {
  { "help",    0, NULL, 'h' },
  { "output",  1, NULL, 'o' },
  { "verbose", 0, NULL, 'v' },
  { NULL,      0, NULL, 0 }
};

    Invoke the getopt_long function, passing it the argc and argv arguments to main,
the character string describing short options, and the array of struct option elements
describing the long options.

NOTE:
1. Each time you call getopt_long, it parses a single option, returning the short-option
letter for that option, or -1 if no more options are found.

2. Typically, you'll call getopt_long in a loop, to process all the options the user has
specified, and you'll handle the specific options in a switch statement.

3. If getopt_long encounters an invalid option (an option that you didn't specify as
a valid short or long option), it prints an error message and returns the character
? (a question mark). Most programs will exit in response to this, possibly after
displaying usage information.

4. When handling an option that takes an argument, the global variable optarg
points to the text of that argument.

5. After getopt_long has finished parsing all the options, the global variable optind
contains the index (into argv) of the first nonoption argument.


-------------------------------------------------------------------------------------------

2.1.4 Standard I/O

      The standard C library provides standard input and output streams (stdin and stdout,
respectively).These are used by scanf, printf, and other library functions.

      In the UNIX tradition, use of standard input and output is customary for GNU/Linux programs.
This allows the chaining of multiple programs using shell pipes and input and output redirection.

     The C library also provides stderr, the standard error stream. Programs should
print warning and error messages to standard error instead of standard output.This
allows users to separate normal output and error messages, for instance, by redirecting
standard output to a file while allowing standard error to print on the console.
     The fprintf function can be used to print to stderr.

example: fprintf (stderr, "Error: ...");

These three streams are also accessible with the underlying UNIX I/O commands
(read, write, and so on) via file descriptors.

These are file descriptors 0 for stdin, 1 for stdout, and 2 for stderr.

      When invoking a program, it is sometimes useful to redirect both standard output
and standard error to a file or pipe.The syntax for doing this varies among shells; for
Bourne-style shells (including bash, the default shell on most GNU/Linux distributions),
the syntax is this:

% program > output_file.txt 2>&1
% program 2>&1 | filter

    The 2>&1 syntax indicates that file descriptor 2 (stderr) should be merged into
file descriptor 1 (stdout). Note that 2>&1 must follow a file redirection (the first example)
but must precede a pipe redirection (the second example)

    Note that stdout is buffered. Data written to stdout is not sent to the console
(or other device, if it's redirected) until the buffer fills, the program exits normally, or
stdout is closed. You can explicitly flush the buffer by calling the following:
fflush (stdout);
    In contrast, stderr is not buffered; data written to stderr goes directly to the console.
This can produce some surprising results.
For example, this loop does not print one
period every second; instead, the periods are buffered, and a bunch of them are printed
together when the buffer fills.

while (1) {
  printf (".");
  sleep (1);
}

In this loop, however, the periods do appear once a second:
while (1) {
  fprintf (stderr, ".");
  sleep (1);
}

     For the above, go through sleep.c, stdout_buff.c, stderr_nonbuff.c


--------------------------------------------------------------------------------------------
2.1.5 Program Exit Codes


      When a program ends, it indicates its status with an exit code. The exit code is a
small integer; by convention, an exit code of zero denotes successful execution,
while nonzero exit codes indicate that an error occurred. Some programs use different
nonzero exit code values to distinguish specific errors.

    With most shells, it's possible to obtain the exit code of the most recently executed
program using the special $? variable.

    Here's an example in which the ls command is invoked twice and its exit code is printed
 after each invocation. In the first case, ls executes correctly and returns the exit code zero.
 In the second case, ls encounters an error (because the filename specified on the command line
 does not exist) and thus returns a nonzero exit code.

% ls /
bin coda etc lib misc nfs proc sbin usr
boot dev home lost+found mnt opt root tmp var

% echo $?
0

% ls bogusfile
ls: bogusfile: No such file or directory
% echo $?
1

NOTE:
1. In C++, the same buffering distinction holds for cout (buffered) and cerr (unbuffered). Note that the endl
token flushes a stream in addition to printing a newline character; if you don't want to flush the
stream (for performance reasons, for example), use a newline constant, \n, instead.
2. There are other methods of providing exit codes, and special exit codes
are assigned to programs that terminate abnormally (by a signal).

-----------------------------------------------------------------------------------------------------
2.1.6 The Environment
      GNU/Linux provides each running program with an environment. The environment is
a collection of variable/value pairs. Both environment variable names and their values
are character strings. By convention, environment variable names are spelled in all
capital letters.

For instance:
1.USER contains your username.
2.HOME contains the path to your home directory.
3.PATH contains a colon-separated list of directories through which Linux searches
for commands.
4.DISPLAY contains the name and display number of the X Window server on
which windows from graphical X Window programs will appear.
5.LD_LIBRARY_PATH explained later.
      Shells provide methods for examining and modifying the environment directly. To print
the current environment in your shell, invoke the printenv program. Various shells have
different built-in syntax for using environment variables; the following is the syntax
for Bourne-style shells.

NOTE:
1.The shell automatically creates a shell variable for each environment variable
that it finds, so you can access environment variable values using the $varname
syntax.

For instance:
% echo $USER
samuel
% echo $HOME
/home/samuel

      Use the export command to export a shell variable into the environment.

For example, to set the EDITOR environment variable:

% EDITOR=emacs

% export EDITOR
Or, for short:

% export EDITOR=emacs

  In a C/C++ program:
1. To access an environment variable, use the getenv function in <stdlib.h>.
That function takes a variable name and returns the corresponding value
as a character string, or NULL if that variable is not defined in the environment.

2. To set or clear environment variables, use the setenv and unsetenv functions, respectively.

     To enumerate all the variables in the environment, you must access a special global variable
named environ, which is defined in the GNU C library. This variable, of type char**, is a
NULL-terminated array of pointers to character strings. Each string contains one environment
variable, in the form VARIABLE=value.

    Go through client.c .
--------------------------------------------------------------------------------------------------
2.1.7 Using Temporary Files


      Sometimes a program needs to make a temporary file, to store large data for a while or
to pass data to another program. On GNU/Linux systems, temporary files are stored
in the /tmp directory.
When using temporary files, one should be aware of the following
pitfalls:
1. More than one instance of your program may be run simultaneously (by the
same user or by different users). The instances should use different temporary
filenames so that they don't collide.

2. The file permissions of the temporary file should be set in such a way that
unauthorized users cannot alter the program's execution by modifying or
replacing the temporary file.
3. Temporary filenames should be generated in a way that cannot be predicted
externally; otherwise, an attacker can exploit the delay between testing whether
a given name is already in use and opening a new temporary file.

GNU/Linux provides functions,
1.mkstemp
2.tmpfile

    Using mkstemp()

    The mkstemp function creates a unique temporary filename from a filename template.

>>> mkstemp creates the file with permissions so that only the current user can access it, and opens
the file for read/write.

>>> The filename template is a character string ending with "XXXXXX" (six capital X's).

>>> mkstemp replaces the X's with characters so that the filename is unique. The return value
is a file descriptor; use the write family of functions to write to the temporary file.

>>> Temporary files created with mkstemp are not deleted automatically. It's up to the user
to remove the temporary file when it's no longer needed.

>>> If the temporary file is for internal use only and will not be handed to another program,
call unlink on it immediately after creating it. The unlink function removes
the directory entry corresponding to a file, but because files in a file system are reference-counted,
the file itself is not removed until there are no open file descriptors for that file, either. This way,
your program may continue to use the temporary file, and the file goes away automatically
as soon as you close the file descriptor.

    Using tmpfile()

    If you are using the standard C library I/O functions (fopen, fclose, ...) and don't need to pass
the temporary file to another program, you can use the tmpfile function. It creates and opens a temporary
file, and returns a file pointer to it.
The temporary file is already unlinked, as in the unlink approach described for mkstemp, so it is deleted automatically when the file
pointer is closed (with fclose) or when the program terminates.

------------------------------------------------------------------------------------------------------------------

2.2 Coding Defensively
    This section demonstrates some coding techniques for finding bugs early and for detecting
 and recovering from problems in a running program.

2.2.1 Using assert

    A good objective to keep in mind when coding application programs is that bugs or
unexpected errors should cause the program to fail dramatically, as early as possible.
This will help you find bugs earlier in the development and testing cycles.

    One of the simplest methods to check for unexpected conditions is the standard C
assert macro. The argument to this macro is a Boolean expression. The program is
terminated if the expression evaluates to false, after printing an error message containing
the source file and line number and the text of the expression.

1. The assert macro is very useful for a wide variety of consistency checks internal
to a program. For instance, use assert to test the validity of function arguments, to
test preconditions and postconditions of function calls (and method calls, in C++),
and to test for unexpected return values.

2. Each use of assert serves not only as a runtime check of a condition, but also as
documentation about the program's operation within the source code. If the program
contains an assert (condition), that says to someone reading your source code that
condition should always be true at that point in the program, and if condition is not
true, it's probably a bug in the program.

3. For performance-critical code, runtime checks such as uses of assert can impose a
significant performance penalty.
In these cases, you can compile your code with the NDEBUG macro defined, by using the -DNDEBUG
flag on your compiler command line. With NDEBUG set, appearances of the assert macro will be
preprocessed away. It's a good idea to do this only when necessary for performance reasons,
though, and only with performance-critical source files.
Because it is possible to preprocess assert macros away, be careful that any expression
you use with assert has no side effects. Specifically, you shouldn’t call functions
inside assert expressions, assign variables, or use modifying operators such as ++.

    Suppose, for example, that you call a function, do_something, repeatedly in a loop.
The do_something function returns zero on success and nonzero on failure, but you
don't expect it ever to fail in your program. You might be tempted to write:

for (i = 0; i < 100; ++i)
  assert (do_something () == 0);

    However, you might find that this runtime check imposes too large a performance
penalty and decide later to recompile with NDEBUG defined. This will remove the assert
call entirely, so the expression will never be evaluated and do_something will
never be called.
You should write this instead:

for (i = 0; i < 100; ++i) {
  int status = do_something ();
  assert (status == 0);
}
    Another thing to bear in mind is that you should not use assert to test for invalid
user input. Users don't like it when applications simply crash with a cryptic error message,
even in response to invalid input. You should still always check for invalid input
and produce sensible error messages in response to such input. Use assert for internal runtime
checks only.

Some good places to use assert are these:
---> Check against null pointers, for instance, as invalid function arguments. The error
message generated by assert (pointer != NULL),
Assertion 'pointer != ((void *)0)' failed.
is more informative than the error message that would result if your program
dereferenced a null pointer:
Segmentation fault (core dumped)

---> Check conditions on function parameter values. For instance, if a function
should be called only with a positive value for parameter foo, use this at the
beginning of the function body:
assert (foo > 0);
This will help you detect misuses of the function, and it also makes it very clear
to someone reading the function's source code that there is a restriction on the
parameter's value. Use assert liberally throughout your programs.


2.2.2 System Call Failures

      Computers have limited resources; hardware fails; many programs execute at the same
time; users and programmers make mistakes. It's often at the boundary between the
application and the operating system that these realities exhibit themselves. Therefore,
when using system calls to access system resources, to perform I/O, or for other purposes,
it's important to understand not only what happens when the call succeeds, but
also how and when the call can fail.

System calls can fail in many ways.
For example:
--->The system can run out of resources (or the program can exceed the resource
limits enforced by the system on a single program).
Ex: The program might try to allocate too much memory, to write too much to a disk,
 or to open too many files at the same time.

---> Linux may block a certain system call when a program attempts to perform an
operation for which it does not have permission. For example, a program might
attempt to write to a file marked read-only, to access the memory of another
process, or to kill another user's program.

---> The arguments to a system call might be invalid, either because the user provided
invalid input or because of a program bug.

Ex: The program might pass an invalid memory address or an invalid file descriptor to a system
call. Or, a program might attempt to open a directory as an ordinary file, or
might pass the name of an ordinary file to a system call that expects a directory.

---> A system call can fail for reasons external to a program. This happens most often
when a system call accesses a hardware device. The device might be faulty or
might not support a particular operation, or perhaps a disk is not inserted in the
drive.

--->A system call can sometimes be interrupted by an external event, such as the
delivery of a signal. This might not indicate outright failure, but it is the responsibility
of the calling program to restart the system call, if desired.

   In a well-written program that makes extensive use of system calls, it is often the case
that more code is devoted to detecting and handling errors and other exceptional circumstances
than to the main work of the program.

-----------------------------------------------------------------------------------------
2.2.3 Error Codes from System Calls

      A majority of system calls return zero if the operation succeeds, or a nonzero value if
the operation fails. Some have different return value conventions; for
instance, malloc returns a null pointer to indicate failure.

This information may be enough to determine whether the program should continue execution as usual,
but it probably does not provide enough information for a sensible recovery from errors.
Most system calls use a special variable named 'errno' to store additional information
in case of failure.

--->errno is a standard global variable, declared in <errno.h>.

--->When a call fails, the system sets errno to a value indicating what went wrong. Because all
system calls use the same errno variable to store error information, you should copy the value into
another variable immediately after the failed call. The value of errno will be overwritten the next
time you make a system call.

--->Error values are integers; possible values are given by preprocessor macros, by convention
named in all capitals and starting with 'E', for example, EACCES and EINVAL.
Always use these macros to refer to errno values rather than integer values. Include the
<errno.h> header for errno values.

--->GNU/Linux provides a convenient function, strerror, that returns a character
string description of an errno error code, suitable for use in error messages. Include
<string.h> for strerror.

--->GNU/Linux also provides perror, which prints the error description directly to
the stderr stream. Pass to perror a character string prefix to print before the error
description, which should usually include the name of the function that failed. Include
<stdio.h> for perror.

      Refer to erro.c for example usage.

--->Some system calls, such as read, select, and sleep, can take significant
time to execute. These are considered blocking functions because program execution
is blocked until the call is completed. However, if the program receives a signal
while blocked in one of these calls, the call will return without completing the operation.
In this case, errno is set to EINTR.

Please note that user Id will be 0 for root user

Go thru a good example with chown.c

----------------------------------------------------------------------------------------------------------
2.2.4 Errors and Resource Allocation -- very important topic

      When a system call fails, it's appropriate to cancel the current operation but not
to terminate the program because it may be possible to recover from the error. One
way to do this is to return from the current function, passing a return code to the
caller indicating the error.
       If you decide to return from the middle of a function, it's important to make sure
that any resources successfully allocated previously in the function are first deallocated.
These resources can include memory, file descriptors, file pointers, temporary files,
synchronization objects, and so on. Otherwise, if your program continues running, the
resources allocated before the failure occurred will be leaked.


Ex: Consider a function that reads from a file into a buffer. The function
might follow these steps:
1. Allocate the buffer.
2. Open the file.
3. Read from the file into the buffer.
4. Close the file.
5. Return the buffer.
   If the file doesn't exist, Step 2 will fail. An appropriate course of action might be to
return NULL from the function. However, if the buffer has already been allocated in
Step 1, there is a risk of leaking that memory. You must remember to deallocate the
buffer somewhere along any flow of control from which you don't return. If Step 3
fails, not only must you deallocate the buffer before returning, but you also must close
the file.

Go through readfile.c also go through readfile_getopt application

----------------------------------------------------------------------------------------------------------------

2.3 Writing and Using Libraries
    Virtually all programs are linked against one or more libraries.
For example:
1. Any program that uses a C function (such as printf or malloc) will be linked
against the C runtime library.
2. If the program has a graphical user interface (GUI), it will be linked against
windowing libraries.
3. If the program uses a database, the database provider will give you libraries
that you can use to access the database conveniently.

In each of these cases, the developer must decide whether to link the library statically or dynamically.

--->If the linkage is static, the programs will be bigger and harder to
upgrade, but probably easier to deploy.

--->If the linkage is dynamic, the programs will be smaller and easier to upgrade, but harder to
deploy. This section explains how to link both statically and dynamically, examines the trade-offs
in more detail, and gives some "rules of thumb" for deciding which kind of linking is better for the developer.
-------------------------------------------------------------------------------------------------------------
2.3.1 Archives
    An archive (or static library) is simply a collection of object files stored as a single file.
(An archive is roughly the equivalent of a Windows .LIB file.)

--->When you provide an archive to the linker, the linker searches the archive for the object files it
needs, extracts them, and links them into your program much as if you had provided those
object files directly.
--->An archive can be created using the ar command. Archive files traditionally use a .a
extension rather than the .o extension used by ordinary object files.

Go thru the static directory for an illustration, which creates libtest.a by combining test1.o and test2.o

% ar cr libtest.a test1.o test2.o
The cr flags tell ar to create the archive.

--->Link with this archive using the -ltest option with gcc or g++.

--->When the linker encounters an archive on the command line, it searches the
archive for all definitions of symbols (functions or variables) that are referenced from
the object files that it has already processed but that are not yet defined.

--->The object files that define those symbols are extracted from the archive and included
in the final executable. Because the linker searches the archive when it is encountered on
the command line, it usually makes sense to put archives at the end of the command line.

--------------------------------------------------------------------------------------------

2.3.2 Shared Libraries
    A shared library (also known as a shared object, or as a dynamically linked library)
 is similar to an archive in that it is a grouping of object files.
However, there are many important differences.
--->The most fundamental difference is that when a shared library is linked into a program,
the final executable does not actually contain the code that is present in the shared library.
Instead, the executable merely contains a reference to the shared library.

---> If several programs on the system are linked against the same shared
library, they will all reference the library, but none of them will actually contain its code. Thus, the
library is "shared" among all the programs that link with it.

--->A shared library is not merely a collection of object files, out of which the linker chooses
those that are needed to satisfy undefined references. Instead, the object files that compose
the shared library are combined into a single object file so that a program that links against a
shared library always includes all of the code in the library, rather than just those portions
that are needed.

--->To create a shared library, you must compile the objects that will make up the
library using the -fPIC option to the compiler:

% gcc -c -fPIC test1.c
The -fPIC option tells the compiler that you are going to be using test1.o as part of a
shared object.

Position-Independent Code (PIC):

    PIC stands for position-independent code. The functions in a shared library may be loaded at
 different addresses in different programs, so the code in the shared object must not depend on the address
 at which it is loaded.

Then you combine the object files into a shared library, like this:

% gcc -shared -fPIC -o libtest.so test1.o test2.o
    The -shared option tells the linker to produce a shared library rather than an ordinary
executable. Shared libraries use the extension .so, which stands for SHARED OBJECT. Like
static archives, the name always begins with lib to indicate that the file is a library.
Linking with a shared library is just like linking with a static archive.

For example, the following line will link with libtest.so if it is in the current directory, or one of
the standard library search directories on the system:

% gcc -o app app.o -L. -ltest
    Suppose that both libtest.a and libtest.so are available. Then the linker must
choose one of the libraries and not the other.

--->The linker searches each directory in turn: first those specified with -L options, and then those
in the standard directories.

--->When the linker finds a directory that contains either libtest.a or libtest.so, it stops
searching directories. If only one of the two variants is present in the directory, the linker
chooses that variant. Otherwise, the linker chooses the shared library version, unless
you explicitly instruct it otherwise.

--->use the -static option to demand static archives.

The following line will use the libtest.a archive, even if the libtest.so shared library is also available:

% gcc -static -o app app.o -L. -ltest

--->The ldd command displays the shared libraries that are linked into an executable. These libraries need
to be available when the executable is run.

--->Note that ldd will list an additional library called ld-linux.so, which is a part of GNU/Linux’s dynamic
linking mechanism.

Using LD_LIBRARY_PATH

--->When linking a program with a shared library, the linker does not put the full path
to the shared library in the resulting executable. Instead, it places only the name of the
shared library. When the program is actually run, the system searches for the shared
library and loads it. The system searches only /lib and /usr/lib, by default. If a shared
library that is linked into your program is installed outside those directories, it will not
be found, and the system will refuse to run the program.

1.One solution to this problem is to use the -Wl,-rpath option when linking the
program. Suppose that you use this:
% gcc -o app app.o -L. -ltest -Wl,-rpath,/usr/local/lib
Then, when app is run, the system will search /usr/local/lib for any required shared
libraries.

2.Another solution to this problem is to set the LD_LIBRARY_PATH environment
variable when running the program.

----------------------------------------------------------------------------------------------
2.3.3 Standard Libraries
    Even if you didn’t specify any libraries when you linked your program, it almost certainly
uses a shared library. That’s because GCC automatically links in the standard C
library, libc. The standard C library math functions are not included in libc;
instead, they’re in a separate library, libm, which you need to specify explicitly.

For example, to compile and link a program compute.c which uses trigonometric functions
such as sin and cos, you must invoke this command:

% gcc -o compute compute.c -lm
    If you write a C++ program and link it using the c++ or g++ commands, you’ll also
get the standard C++ library, libstdc++, automatically.
-----------------------------------------------------------------------------------------

2.3.4 Library Dependencies
    One library will often depend on another library. For example, many GNU/Linux
systems include libtiff, a library that contains functions for reading and writing
image files in the TIFF format. This library, in turn, uses the libraries libjpeg (JPEG
image routines) and libz (compression routines).

Listing 2.9 shows a very small program that uses libtiff to open a TIFF image file.
Listing 2.9 (tifftest.c) Using libtiff
#include <stdio.h>
#include <tiffio.h>

int main (int argc, char** argv)
{
  TIFF* tiff;
  tiff = TIFFOpen (argv[1], "r");
  TIFFClose (tiff);
  return 0;
}

--------------------------------------------------------------------------------------
######HA-HA-HA-Ha###############
You might see a reference to LD_RUN_PATH in some online documentation. Don’t believe
what you read; this variable does not actually do anything under GNU/Linux.
######HA-HA-HA-Ha###############

    Save this source file as tifftest.c. To compile this program and link with libtiff,
specify -ltiff on your link line:
% gcc -o tifftest tifftest.c -ltiff

By default, this will pick up the shared-library version of libtiff, found at
/usr/lib/libtiff.so. Because libtiff uses libjpeg and libz, the shared-library
versions of these two are also drawn in (a shared library can point to other shared
libraries that it depends on). To verify this, use the ldd command:

% ldd tifftest
libtiff.so.3 => /usr/lib/libtiff.so.3 (0x4001d000)
libc.so.6 => /lib/libc.so.6 (0x40060000)
libjpeg.so.62 => /usr/lib/libjpeg.so.62 (0x40155000)
libz.so.1 => /usr/lib/libz.so.1 (0x40174000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

Static libraries, on the other hand, cannot point to other libraries. If you decide to link
with the static version of libtiff by specifying -static on your command line, you
will encounter unresolved symbols:

% gcc -static -o tifftest tifftest.c -ltiff
/usr/bin/../lib/libtiff.a(tif_jpeg.o): In function ‘TIFFjpeg_error_exit’:
tif_jpeg.o(.text+0x2a): undefined reference to ‘jpeg_abort’
/usr/bin/../lib/libtiff.a(tif_jpeg.o): In function ‘TIFFjpeg_create_compress’:
tif_jpeg.o(.text+0x8d): undefined reference to ‘jpeg_std_error’
tif_jpeg.o(.text+0xcf): undefined reference to ‘jpeg_CreateCompress’
...
To link this program statically, you must specify the other two libraries yourself:


% gcc -static -o tifftest tifftest.c -ltiff -ljpeg -lz

Occasionally, two libraries will be mutually dependent. In other words, the first archive
will reference symbols defined in the second archive, and vice versa. This situation
generally arises out of poor design, but it does occasionally arise. In this case, you can
provide a single library multiple times on the command line. The linker will search
the library again each time it occurs. For example, this line will cause libfoo.a to be
searched multiple times:

% gcc -o app app.o -lfoo -lbar -lfoo
So, even if libfoo.a references symbols in libbar.a, and vice versa, the program will
link successfully.
-----------------------------------------------------------------------------------------

2.3.5 Pros and Cons of shared libraries

***--->    One major advantage of a shared library is that it saves space on the system where
the program is installed. If you are installing 10 programs, and they all make use of the
same shared library, then you save a lot of space by using a shared library. If you used a
static archive instead, the archive is included in all 10 programs. So, using shared
libraries saves disk space. It also reduces download times if your program is being
downloaded from the Web.

***--->A related advantage to shared libraries is that users can upgrade the libraries without
upgrading all the programs that depend on them.

 For example, suppose that you produce a shared library that manages HTTP connections. Many
programs might depend on this library. If you find a bug in this library, you can upgrade
the library. Instantly, all the programs that depend on the library will be fixed; you don’t
have to relink all the programs the way you do with a static archive.
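
Those advantages might make you think that you should always use shared
libraries. That is not necessarily the case. Upgrading a shared library affects every program
that depends on it, so a change in the library’s behavior can break programs that worked before.
A program linked against a static archive carries its own copy of the code and does not need the
library to be installed, or findable, at run time.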
---------------------------------------------------------------------------------------------------
2.3.6 Dynamic Loading and Unloading
    Sometimes you might want to load some code at run time without explicitly linking
in that code.
This functionality is available under Linux by using the dlopen function. You could
open a shared library named libtest.so by calling dlopen like this:
dlopen ("libtest.so", RTLD_LAZY)

You can then look up a function by name with dlsym, call it through the returned pointer,
and close the library when you are done:
void* handle = dlopen ("libtest.so", RTLD_LAZY);
void (*test)() = dlsym (handle, "my_function");
(*test)();
dlclose (handle);
    The dlsym function (a library function, not a system call) can also be used to obtain a pointer to a
global variable in the shared library. Both dlopen and dlsym return NULL if they do not succeed. In that event, you
can call dlerror (with no parameters) to obtain a human-readable error message describing the problem.

***--->    The dlclose function unloads the shared library. Technically, dlopen actually loads
the library only if it is not already loaded. If the library has already been loaded,
dlopen simply increments the library reference count. Similarly, dlclose decrements
the reference count and then unloads the library only if the reference count has
reached zero.

***--->If you’re writing the code in your shared library in C++, you will probably want
to declare those functions and variables that you plan to access elsewhere with the
extern "C" linkage specifier. This keeps the C++ compiler from mangling the name, so dlsym
can find the symbol under the name you wrote. For instance, if the C++ function my_function is in a
shared library and you want to access it with dlsym, you should declare it like this:
extern "C" void my_function ();

                        ALP-Chapter-02-Summary

    A RUNNING INSTANCE OF A PROGRAM IS CALLED A PROCESS.

Ex: 1.If two terminal windows are showing on your screen, you are running two terminal processes. Each terminal
window is probably running a shell; each running shell is another process. When you
invoke a command from a shell, the corresponding program is executed in a new
process; the shell process resumes when that process completes.

    2.Advanced programmers often use multiple cooperating processes in a single application
to enable the application to do more than one thing at once, to increase application robustness,
and to make use of already-existing programs.

------------------------------------------------------------------------------------------------
3.1 Looking at Processes
    Even as you sit down at your computer, there are processes running. Every executing
program uses one or more processes. Let’s start by taking a look at the processes
already on your computer.

------------------------------------------------------------------------------------------------
3.1.1 Process IDs

--->Each process in a Linux system is identified by its unique process ID, referred to as pid.

--->Process IDs are integers that are assigned sequentially by Linux as new processes are created. The book
describes them as 16-bit numbers; on current Linux systems pid_t is a 32-bit int, and the upper limit is set
by /proc/sys/kernel/pid_max.

--->Every process also has a parent process, except the special init process.

--->Processes on a Linux system are arranged in a tree, with the init process at its root. The parent
process ID, or ppid, is simply the process ID of the process’s parent.

--->When referring to process IDs in a C or C++ program, always use the pid_t
typedef, which is defined in <sys/types.h>.

--->A program can obtain the process ID of the process it’s running in with the getpid() system call,
and it can obtain the process ID of its parent process with the getppid() system call.

Go through ALP_Chapter3 pid.c

------------------------------------------------------------------------------------------------
3.1.2 Viewing Active Processes
    The ps command displays the processes that are running on your system. The
GNU/Linux version of ps has lots of options because it tries to be compatible with
versions of ps on several other UNIX variants. These options control which processes
are listed and what information about each is shown.
--->By default, invoking ps displays the processes controlled by the terminal or terminal
window in which ps is invoked. For example:

% ps
PID TTY TIME CMD
21693 pts/8 00:00:00 bash
21694 pts/8 00:00:00 ps

--->This invocation of ps shows two processes. The first, bash, is the shell running on this
terminal. The second is the running instance of the ps program itself. The first column,
labeled PID, displays the process ID of each.

For a more detailed look invoke this:
% ps -e -o pid,ppid,command
    The -e option instructs ps to display all processes running on the system. The
-o pid,ppid,command option tells ps what information to show about each process:
in this case, the process ID, the parent process ID, and the command running in this
process.

ps Output Formats
    With the -o option to the ps command, you specify the information about processes that you want in
the output as a comma-separated list.
--->Ex:ps -o pid,user,start_time,command
The above command displays the process ID, the name of the user owning the process, the wall clock time at
which the process started, and the command running in the process. See the man page for ps for the full list
of field codes.

--->Alternatively, use the -f (full listing), -l (long listing), or -j (jobs listing) options to get three different
preset listing formats.

Here are the first few lines and last few lines of output from ps -e -o pid,ppid,command on my
system. You may see different output, depending on what’s running on your system.

% ps -e -o pid,ppid,command
PID PPID COMMAND
1 0 init [5]
2 1 [kflushd]
3 1 [kupdate]
...
21725 21693 xterm
21727 21725 bash
21728 21727 ps -e -o pid,ppid,command

--->Note that the parent process ID of the ps command, 21727, is the process ID of bash,
the shell from which I invoked ps. The parent process ID of bash is in turn 21725, the
process ID of the xterm program in which the shell is running.

-----------------------------------------------------------------------------------------------------------
3.1.3 Killing a Process
    You can kill a running process with the kill command. Simply specify on the command
line the process ID of the process to be killed.
--->The kill command works by sending the process a SIGTERM, or termination,
signal.
--->This causes the process to terminate, unless the executing program explicitly
handles or masks the SIGTERM signal.

-----------------------------------------------------------------------------------------------------------
3.2 Creating Processes
    Two common techniques are used for creating a new process.
1.Using the system function (a C library function, not a system call).
2.Using fork and exec.

    The first is relatively simple but should be used sparingly because it is inefficient
and has considerable security risks. The second technique is more complex but provides greater
flexibility, speed, and security.
-----------------------------------------------------------------------------------------------------------

3.2.1 Using system
    The system function in the standard C library provides an easy way to execute a
command from within a program, much as if the command had been typed into a
shell.

--->"system" creates a subprocess running the standard Bourne shell (/bin/sh)
and hands the command to that shell for execution.

Ex:Go through system.c of ALP_Chapter3

--->The system function returns the exit status of the shell command.

--->If the shell itself cannot be run, system returns 127; if another error occurs, system returns -1.

--->Because the system function uses a shell to invoke your command, it’s subject to the features, limitations,
and security flaws of the system’s shell.

----------------------------------------------------------------------------------------------------------------

3.2.2 Using fork and exec
    The fork function makes a child process that is an exact copy of its parent process. Linux provides another set of functions,
the exec family, that causes a particular process to cease being an instance of one program and to instead become an
instance of another program. To spawn a new process, you first use fork to make a copy of the current process. Then you
use exec to transform one of these processes into an instance of the program you want to spawn.
-------------------------------------------------------------------------------------------------------------------

1.Calling fork
    When a program calls fork, a duplicate process, called the child process, is created. The parent process
continues executing the program from the point that fork was called. The child process, too, executes the same program
from the same place.

--->The child process is a new process and therefore has a new process ID, distinct from its parent’s process ID.

--->One way for a program to distinguish whether it’s in the parent process or the child process is to call
getpid. However, the fork function provides different return values to the parent and child processes: one process "goes in"
to the fork call, and two processes "come out," with different return values. The return value in the parent process is
the process ID of the child. The return value in the child process is zero. Because no process ever has a
process ID of zero, this makes it easy for the program to tell whether it is now running as the
parent or the child process.

GO through fork.c of ALP_Chapter3
------------------------------------------------------------------------------------------------------------------

2.Using the exec Family
    The exec functions replace the program running in a process with another program.
---> When a program calls an exec function, that process immediately ceases executing that program and begins executing
a new program from the  beginning, assuming that the exec call doesn’t encounter an error.

--->Within the exec family, there are functions that vary slightly in their capabilities and how they are called.

--->Functions that contain the letter p in their names (execvp and execlp) accept a program name and search for a
program by that name in the current execution path.

--->Functions that don’t contain the p must be given the full path of the program to be executed.

--->functions that contain the letter v in their names (execv, execvp, and execve)
accept the argument list for the new program as a NULL-terminated array of
pointers to strings.

--->Functions that contain the letter l (execl, execlp, and execle) accept the argument list using the
C language’s varargs mechanism.

---> Functions that contain the letter e in their names (execve and execle) accept an additional argument,
an array of environment variables. The argument should be a NULL-terminated array of pointers to character strings.

--->Each character string should be of the form "VARIABLE=value". Because exec replaces the calling program with
 another one, it never returns unless an error occurs.

-------------------------------------------------------------------------------------------------------------------------
3.Using fork and exec Together
    A common pattern to run a subprogram within a program is first to fork the process  and then exec the subprogram.
This allows the calling program to continue execution in the parent process while the calling program is replaced by the
subprogram in the child process.

  Go through fork_exec.c of ALP_Chapter3
-------------------------------------------------------------------------------------------------------------------------
3.2.3 Process Scheduling
--->Because Linux schedules the parent and child processes independently, there’s no guarantee of
which one will run first, or how long it will run before Linux interrupts it and lets the
other process (or some other process on the system) run.

--->In particular, in the fork_exec.c example, none, part, or all of the ls command may run in the child
process before the parent completes.

---> Linux promises that each process will run eventually—no process will be completely starved
of execution resources.

--->You can specify that a process is less important, and should be given a lower priority,
by assigning it a higher niceness value. By default, every process has a niceness of zero.

--->A higher niceness value means that the process is given a lesser execution priority;
conversely, a process with a lower (that is, negative) niceness gets more execution time.

--->To run a program with a nonzero niceness, use the nice command, specifying the niceness value with
 the -n option. For example, this is how you might invoke the command “sort input.txt > output.txt”,  a
long sorting operation, with a reduced priority so that it doesn’t slow down the system too much:

% nice -n 10 sort input.txt > output.txt

---> Use renice command to change the niceness of a running process from
the command line.

---> To change the niceness of a running process in a program , use the nice function.
Its argument is an increment value, which is added to the niceness value of the
process that calls it.

--->Remember that a positive value raises the niceness value and thus
reduces the process’s execution priority.

NOTE:
Only a process with root privilege can run a process with a negative niceness
value or reduce the niceness value of a running process. This means that you may
specify negative values to the nice and renice commands only when logged in as
root, and only a process running as root can pass a negative value to the nice function.
-----------------------------------------------------------------------------------------------
3.3 Signals
    Signals are mechanisms for communicating with and manipulating processes in Linux.
--->A signal is a special message sent to a process.

--->Signals are asynchronous; when a process receives a signal, it processes the signal immediately,
 without finishing the current function or even the current line of code.

--->Each signal type is specified by its signal number, but in programs, you usually refer to a signal by
its name. In Linux, these are defined in /usr/include/bits/signum.h.

        For each signal, there is a default disposition, which determines what
happens to the process if the program does not specify some other behavior. For most
signal types, a program may specify some other behavior—either to ignore the signal
or to call a special signal-handler function to respond to the signal. If a signal handler is
used, the currently executing program is paused, the signal handler is executed, and,
when the signal handler returns, the program resumes.

        The Linux system sends signals to processes in response to specific conditions. For
instance, SIGBUS (bus error), SIGSEGV (segmentation violation), and SIGFPE (floating
point exception) may be sent to a process that attempts to perform an illegal operation.
The default disposition for these signals is to terminate the process and produce a
core file.

        A process may also send a signal to another process. One common use of this
mechanism is to end another process by sending it a SIGTERM or SIGKILL signal.
Another common use is to send a command to a running program. Two "user-defined"
signals are reserved for this purpose: SIGUSR1 and SIGUSR2. The SIGHUP signal
is sometimes used for this purpose as well, commonly to wake up an idling program
or cause a program to reread its configuration files.

    struct sigaction {
        void     (*sa_handler)(int);
        void     (*sa_sigaction)(int, siginfo_t *, void *);
        sigset_t   sa_mask;
        int        sa_flags;
        void     (*sa_restorer)(void);
    };

int sigaction (int signum, const struct sigaction *act, struct sigaction *oldact);

        The sigaction function can be used to set a signal disposition.
--->The first parameter is the signal number.
--->The next two parameters are pointers to sigaction structures;
--->The first of these contains the desired disposition for that signal number
--->The second receives the previous disposition.

The most important field in the first or second sigaction structure is sa_handler.
It can take one of three values:
--->SIG_DFL, which specifies the default disposition for the signal.
--->SIG_IGN, which specifies that the signal should be ignored.
--->A pointer to a signal-handler function. The function should take one parameter,
the signal number, and return void.

 NOTE:
      --->Because signals are asynchronous, the main program may be in a very fragile state
when a signal is processed and thus while a signal handler function executes.
Therefore, you should avoid performing any I/O operations or calling most library
and system functions from signal handlers.
      --->A signal handler should perform the minimum work necessary to respond to the
signal, and then return control to the main program (or terminate the program).
      --->It is possible for a signal handler to be interrupted by the delivery of another signal.
      --->While this may sound like a rare occurrence, if it does occur, it will be very difficult to
diagnose and debug the problem.

SIGTERM and SIGKILL
    --->The SIGTERM signal asks a process to terminate; the process may ignore the request by masking or ignoring
the signal.
    --->The SIGKILL signal always kills the process immediately because it cannot be masked, handled, or ignored.

    --->Assigning a value to a global variable in a signal handler can be dangerous because the assignment
may actually be carried out in two or more machine instructions, and a second signal
may occur between them, leaving the variable in a corrupted state.
    --->So use the sig_atomic_t type for such variables. Assignments to this type are atomic.

-----------------------------------------------------------------------------------------------------------------------------
3.4 Process Termination
      Normally, a process terminates in one of two ways. Either the executing program calls
the exit function, or the program’s main function returns.
Each process has an exit code: a number that the process returns to its parent. The exit code is
the argument passed to the exit function, or the value returned from main. A process may also terminate
abnormally, in response to a signal. For instance, the SIGBUS, SIGSEGV, and SIGFPE signals mentioned
previously cause the process to terminate.
 Other signals are used to terminate a process explicitly. The SIGINT signal is sent
to a process when the user attempts to end it by typing Ctrl+C in its terminal. The
SIGTERM signal is sent by the kill command. The default disposition for both of these
is to terminate the process.
By calling the abort function, a process sends itself the SIGABRT signal, which terminates the process and
produces a core file. The most powerful termination signal is SIGKILL, which ends a process immediately and
cannot be blocked or handled by a program.

To send SIGKILL from the shell, use either of these forms:
% kill -KILL pid
% kill -s KILL pid
To send a signal from a program, use the kill function. The first parameter is the target
process ID. The second parameter is the signal number; use SIGTERM to simulate the
default behavior of the kill command. For instance, where child_pid contains the
process ID of the child process, you can use the kill function to terminate a child
process from the parent by calling it like this:
kill (child_pid, SIGTERM);

   Obtain the exit code of the most recently executed program using the special $? variable.
% ls /
bin coda etc lib misc nfs proc sbin usr
boot dev home lost+found mnt opt root tmp var
% echo $?
0
% ls bogusfile
ls: bogusfile: No such file or directory
% echo $?
(The second echo prints a nonzero value, because ls reported an error.)

Note: Use exit codes only between zero and 127. Exit codes above 128 have a special meaning: when
a process is terminated by a signal, its exit code is 128 plus the signal number.

---------------------------------------------------------------------------------------------------------------------
3.4.1 The wait System Calls
        wait() blocks the calling process until one of its child processes exits (or an error occurs).
It returns a status code via an integer pointer argument, from which you can extract information
about how the child process exited. For instance, the WEXITSTATUS macro extracts the child process’s exit code.

      Use the WIFEXITED macro to determine from a child process’s exit status
whether that process exited normally (via the exit function or returning from main)
or died from an unhandled signal.
In the latter case, use the WTERMSIG macro to extract from its exit status the signal number by which it died.

    The wait3 function returns CPU usage statistics about the exiting child process, and the wait4
function allows you to specify additional options about which processes to wait for.
Go through wait.c
---------------------------------------------------------------------------------------

3.4.2 Zombie Processes
    What happens when a child process terminates and the parent is not calling wait?
Does it simply vanish? No, because then information about its termination (such as
whether it exited normally and, if so, what its exit status is) would be lost. Instead,
when a child process terminates, it becomes a zombie process.
    A zombie process is a process that has terminated but has not been cleaned up yet. It
is the responsibility of the parent process to clean up its zombie children. The wait
functions do this, too, so it’s not necessary to track whether your child process is still
executing before waiting for it. Suppose, for instance, that a program forks a child
process, performs some other computations, and then calls wait. If the child process
has not terminated at that point, the parent process will block in the wait call until the
child process finishes. If the child process finishes before the parent process calls wait,
the child process becomes a zombie. When the parent process calls wait, the zombie
child’s termination status is extracted, the child process is deleted, and the wait call
returns immediately.
Go through zombie.c
------------------------------------------------------------------------------------------

3.4.4 Cleaning Up Children Asynchronously

One approach would be for the parent process to call wait3 or wait4 periodically,
to clean up zombie children. Calling wait for this purpose doesn’t work well because,
if no children have terminated, the call will block until one does. However, wait3 and
wait4 take an additional flag parameter, to which you can pass the flag value WNOHANG.
With this flag, the function runs in nonblocking mode: it will clean up a terminated
child process if there is one, or simply return if there isn’t. The return value of the call
is the process ID of the terminated child in the former case, or zero in the latter case.

A more elegant solution is to notify the parent process when a child terminates.
There are several ways to do this using the methods discussed in Chapter 5,
"Interprocess Communication," but fortunately Linux does this for you, using signals.
When a child process terminates, Linux sends the parent process the SIGCHLD signal.

    Thus, an easy way to clean up child processes is by handling SIGCHLD. Of course,
when cleaning up the child process, it’s important to store its termination status if this
information is needed, because once the process is cleaned up using wait, that information
is no longer available. Go through example cleanup_child.c
