ML 4PolyR: Fitting (train) a fourth-degree polynomial regression (4PolyR) model with two kinds of regularization (Lasso/Ridge) on the pizza dataset and predicting prices (test)

Output

Design approach

Core code

from sklearn.linear_model import Lasso, Ridge

# L1-regularized fit on the degree-4 polynomial features
lasso_poly4 = Lasso()
lasso_poly4.fit(X_train_poly4, y_train)

# L2-regularized fit on the same features
ridge_poly4 = Ridge()
ridge_poly4.fit(X_train_poly4, y_train)
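
For reference, below is a minimal end-to-end sketch of the workflow this post describes, assuming the commonly used pizza diameter/price toy data; the exact values, the variable names such as X_train_poly4, and the default alpha settings are illustrative assumptions, not taken from the output above.

import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.preprocessing import PolynomialFeatures

# Toy pizza data (diameter in inches -> price); values assumed for illustration
X_train = np.array([[6], [8], [10], [14], [18]])
y_train = np.array([7, 9, 13, 17.5, 18])
X_test = np.array([[6], [8], [11], [16]])
y_test = np.array([8, 12, 15, 18])

# Map the single diameter feature to degree-4 polynomial features
poly4 = PolynomialFeatures(degree=4)
X_train_poly4 = poly4.fit_transform(X_train)
X_test_poly4 = poly4.transform(X_test)

# Plain 4th-degree polynomial regression (no regularization)
regressor_poly4 = LinearRegression().fit(X_train_poly4, y_train)

# L1 (Lasso) and L2 (Ridge) regularized variants on the same features
lasso_poly4 = Lasso().fit(X_train_poly4, y_train)
ridge_poly4 = Ridge().fit(X_train_poly4, y_train)

# Compare test-set R^2 and the learned coefficients
print('LinearRegression R^2:', regressor_poly4.score(X_test_poly4, y_test))
print('Lasso R^2:', lasso_poly4.score(X_test_poly4, y_test), lasso_poly4.coef_)
print('Ridge R^2:', ridge_poly4.score(X_test_poly4, y_test), ridge_poly4.coef_)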
class Lasso(ElasticNet):
    """Linear Model trained with L1 prior as regularizer (aka the Lasso)

    The optimization objective for Lasso is::

        (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

    Technically the Lasso model is optimizing the same objective function as
    the Elastic Net with ``l1_ratio=1.0`` (no L2 penalty).

    Read more in the :ref:`User Guide <lasso>`.

    Parameters
    ----------
    alpha : float, optional
        Constant that multiplies the L1 term. Defaults to 1.0.
        ``alpha = 0`` is equivalent to an ordinary least square, solved
        by the :class:`LinearRegression` object. For numerical
        reasons, using ``alpha = 0`` with the ``Lasso`` object is not advised.
        Given this, you should use the :class:`LinearRegression` object.

    fit_intercept : boolean
        whether to calculate the intercept for this model. If set
        to false, no intercept will be used in calculations
        (e.g. data is expected to be already centered).

    normalize : boolean, optional, default False
        This parameter is ignored when ``fit_intercept`` is set to False.
        If True, the regressors X will be normalized before regression by
        subtracting the mean and dividing by the l2-norm.
        If you wish to standardize, please use
        :class:`sklearn.preprocessing.StandardScaler` before calling ``fit``
        on an estimator with ``normalize=False``.

    precompute : True | False | array-like, default=False
        Whether to use a precomputed Gram matrix to speed up
        calculations. If set to ``'auto'`` let us decide. The Gram
        matrix can also be passed as argument. For sparse input
        this option is always ``True`` to preserve sparsity.

    copy_X : boolean, optional, default True
        If ``True``, X will be copied; else, it may be overwritten.

    max_iter : int, optional
        The maximum number of iterations

    tol : float, optional
        The tolerance for the optimization: if the updates are
        smaller than ``tol``, the optimization code checks the
        dual gap for optimality and continues until it is smaller
        than ``tol``.

    warm_start : bool, optional
        When set to True, reuse the solution of the previous call to fit as
        initialization, otherwise, just erase the previous solution.

    positive : bool, optional
        When set to ``True``, forces the coefficients to be positive.

    random_state : int, RandomState instance or None, optional, default None
        The seed of the pseudo random number generator that selects a random
        feature to update. If int, random_state is the seed used by the random
        number generator; If RandomState instance, random_state is the random
        number generator; If None, the random number generator is the
        RandomState instance used by `np.random`. Used when ``selection`` ==
        'random'.

    selection : str, default 'cyclic'
        If set to 'random', a random coefficient is updated every iteration
        rather than looping over features sequentially by default. This
        (setting to 'random') often leads to significantly faster convergence
        especially when tol is higher than 1e-4.

    Attributes
    ----------
    coef_ : array, shape (n_features,) | (n_targets, n_features)
        parameter vector (w in the cost function formula)

    sparse_coef_ : scipy.sparse matrix, shape (n_features, 1) | (n_targets, n_features)
        ``sparse_coef_`` is a readonly property derived from ``coef_``

    intercept_ : float | array, shape (n_targets,)
        independent term in decision function.

    n_iter_ : int | array-like, shape (n_targets,)
        number of iterations run by the coordinate descent solver to reach
        the specified tolerance.

    Examples
    --------
    >>> from sklearn import linear_model
    >>> clf = linear_model.Lasso(alpha=0.1)
    >>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
    Lasso(alpha=0.1, copy_X=True, fit_intercept=True, max_iter=1000,
       normalize=False, positive=False, precompute=False, random_state=None,
       selection='cyclic', tol=0.0001, warm_start=False)
    >>> print(clf.coef_)
    [ 0.85  0.  ]
    >>> print(clf.intercept_)
    0.15

    See also
    --------
    lars_path
    lasso_path
    LassoLars
    LassoCV
    LassoLarsCV
    sklearn.decomposition.sparse_encode

    Notes
    -----
    The algorithm used to fit the model is coordinate descent.

    To avoid unnecessary memory duplication the X argument of the fit method
    should be directly passed as a Fortran-contiguous numpy array.
    """
    path = staticmethod(enet_path)
    def __init__(self, alpha=1.0, fit_intercept=True, normalize=False,
                 precompute=False, copy_X=True, max_iter=1000,
                 tol=1e-4, warm_start=False, positive=False,
                 random_state=None, selection='cyclic'):
        super(Lasso, self).__init__(alpha=alpha, l1_ratio=1.0,
            fit_intercept=fit_intercept, normalize=normalize,
            precompute=precompute, copy_X=copy_X, max_iter=max_iter, tol=tol,
            warm_start=warm_start, positive=positive,
            random_state=random_state, selection=selection)
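
As a quick check of the equivalence noted in the docstring above (Lasso optimizes the same objective as ElasticNet with ``l1_ratio=1.0``), here is a small sketch on synthetic data; the data shape and the alpha value are assumptions for illustration only.

import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic data with one irrelevant feature (assumed, for illustration)
rng = np.random.RandomState(0)
X = rng.randn(50, 3)
y = X @ np.array([1.5, 0.0, -2.0]) + 0.1 * rng.randn(50)

lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)  # pure L1 penalty

# The two coefficient vectors coincide up to solver tolerance, and the
# L1 penalty drives the irrelevant coefficient toward zero
print(lasso.coef_, enet.coef_, np.allclose(lasso.coef_, enet.coef_))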

class Ridge(_BaseRidge, RegressorMixin):
    """Linear least squares with l2 regularization.

    This model solves a regression model where the loss function is
    the linear least squares function and regularization is given by
    the l2-norm. Also known as Ridge Regression or Tikhonov regularization.
    This estimator has built-in support for multi-variate regression
    (i.e., when y is a 2d-array of shape [n_samples, n_targets]).

    Read more in the :ref:`User Guide <ridge_regression>`.

    Parameters
    ----------
    alpha : {float, array-like}, shape (n_targets)
        Regularization strength; must be a positive float. Regularization
        improves the conditioning of the problem and reduces the variance of
        the estimates. Larger values specify stronger regularization.
        Alpha corresponds to ``C^-1`` in other linear models such as
        LogisticRegression or LinearSVC. If an array is passed, penalties are
        assumed to be specific to the targets. Hence they must correspond in
        number.

    fit_intercept : boolean
        Whether to calculate the intercept for this model. If set
        to false, no intercept will be used in calculations
        (e.g. data is expected to be already centered).

    normalize : boolean, optional, default False
        This parameter is ignored when ``fit_intercept`` is set to False.
        If True, the regressors X will be normalized before regression by
        subtracting the mean and dividing by the l2-norm.
        If you wish to standardize, please use
        :class:`sklearn.preprocessing.StandardScaler` before calling ``fit``
        on an estimator with ``normalize=False``.

    copy_X : boolean, optional, default True
        If True, X will be copied; else, it may be overwritten.

    max_iter : int, optional
        Maximum number of iterations for conjugate gradient solver.
        For 'sparse_cg' and 'lsqr' solvers, the default value is determined
        by scipy.sparse.linalg. For 'sag' solver, the default value is 1000.

    tol : float
        Precision of the solution.

    solver : {'auto', 'svd', 'cholesky', 'lsqr', 'sparse_cg', 'sag', 'saga'}
        Solver to use in the computational routines:

        - 'auto' chooses the solver automatically based on the type of data.

        - 'svd' uses a Singular Value Decomposition of X to compute the Ridge
          coefficients. More stable for singular matrices than 'cholesky'.

        - 'cholesky' uses the standard scipy.linalg.solve function to
          obtain a closed-form solution.

        - 'sparse_cg' uses the conjugate gradient solver as found in
          scipy.sparse.linalg.cg. As an iterative algorithm, this solver is
          more appropriate than 'cholesky' for large-scale data
          (possibility to set `tol` and `max_iter`).

        - 'lsqr' uses the dedicated regularized least-squares routine
          scipy.sparse.linalg.lsqr. It is the fastest but may not be available
          in old scipy versions. It also uses an iterative procedure.

        - 'sag' uses a Stochastic Average Gradient descent, and 'saga' uses
          its improved, unbiased version named SAGA. Both methods also use an
          iterative procedure, and are often faster than other solvers when
          both n_samples and n_features are large. Note that 'sag' and
          'saga' fast convergence is only guaranteed on features with
          approximately the same scale. You can preprocess the data with a
          scaler from sklearn.preprocessing.

        All last five solvers support both dense and sparse data. However,
        only 'sag' and 'saga' supports sparse input when `fit_intercept` is
        True.

        .. versionadded:: 0.17
           Stochastic Average Gradient descent solver.
        .. versionadded:: 0.19
           SAGA solver.

    random_state : int, RandomState instance or None, optional, default None
        The seed of the pseudo random number generator to use when shuffling
        the data. If int, random_state is the seed used by the random number
        generator; If RandomState instance, random_state is the random number
        generator; If None, the random number generator is the RandomState
        instance used by `np.random`. Used when ``solver`` == 'sag'.

        .. versionadded:: 0.17
           *random_state* to support Stochastic Average Gradient.

    Attributes
    ----------
    coef_ : array, shape (n_features,) or (n_targets, n_features)
        Weight vector(s).

    intercept_ : float | array, shape = (n_targets,)
        Independent term in decision function. Set to 0.0 if
        ``fit_intercept = False``.

    n_iter_ : array or None, shape (n_targets,)
        Actual number of iterations for each target. Available only for
        sag and lsqr solvers. Other solvers will return None.

        .. versionadded:: 0.17

    See also
    --------
    RidgeClassifier, RidgeCV, :class:`sklearn.kernel_ridge.KernelRidge`

    Examples
    --------
    >>> from sklearn.linear_model import Ridge
    >>> import numpy as np
    >>> n_samples, n_features = 10, 5
    >>> np.random.seed(0)
    >>> y = np.random.randn(n_samples)
    >>> X = np.random.randn(n_samples, n_features)
    >>> clf = Ridge(alpha=1.0)
    >>> clf.fit(X, y) # doctest: +NORMALIZE_WHITESPACE
    Ridge(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=None,
    normalize=False, random_state=None, solver='auto', tol=0.001)

    """
    def __init__(self, alpha=1.0, fit_intercept=True, normalize=False,
                 copy_X=True, max_iter=None, tol=1e-3, solver="auto",
                 random_state=None):
        super(Ridge, self).__init__(alpha=alpha, fit_intercept=fit_intercept,
            normalize=normalize, copy_X=copy_X, max_iter=max_iter, tol=tol,
            solver=solver, random_state=random_state)

    def fit(self, X, y, sample_weight=None):
        """Fit Ridge regression model

        Parameters
        ----------
        X : {array-like, sparse matrix}, shape = [n_samples, n_features]
            Training data

        y : array-like, shape = [n_samples] or [n_samples, n_targets]
            Target values

        sample_weight : float or numpy array of shape [n_samples]
            Individual weights for each sample

        Returns
        -------
        self : returns an instance of self.
        """
        return super(Ridge, self).fit(X, y, sample_weight=sample_weight)
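
To see the effect of the alpha parameter documented above, a short sketch that refits Ridge with several regularization strengths on the (assumed) degree-4 pizza features from the earlier example; a larger alpha applies a stronger L2 penalty and shrinks the coefficient norm.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures

# Assumed pizza toy data, same as in the earlier sketch
X_train = np.array([[6], [8], [10], [14], [18]])
y_train = np.array([7, 9, 13, 17.5, 18])
X_train_poly4 = PolynomialFeatures(degree=4).fit_transform(X_train)

# Larger alpha -> stronger L2 penalty -> smaller ||coef_||
for a in (0.1, 1.0, 10.0):
    model = Ridge(alpha=a).fit(X_train_poly4, y_train)
    print(a, np.linalg.norm(model.coef_), model.intercept_)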